www.artificialintelligenceupdate.com

Learn GraphRAG with Python, Ollama, and NetworkX

Imagine a world where AI understands not just the words you say,
but also the intricate relationships between the concepts. This is the vision behind GraphRAG, a groundbreaking technique that leverages the power of graph theory to enhance the capabilities of Retrieval Augmented Generation (RAG). By representing information as interconnected nodes and edges, GraphRAG empowers AI models to delve deeper into the fabric of knowledge, leading to more accurate, comprehensive, and contextually relevant responses.


1. Introduction to Large Language Models (LLMs)

  • What are LLMs?
    LLMs, like GPT, are models trained on vast amounts of text to generate human-like text. They can understand and generate language based on prompts provided by users.
  • Context Window in LLMs
    The context window refers to the amount of information (in tokens or words) that an LLM can consider at a time while generating responses. Think of it like a short-term memory limit.
  • Context Window Limit
    The window is limited by design, meaning the model can only "remember" or take into account a certain amount of input at once. This limitation impacts how well it can respond to queries, especially when the input is long or complex.
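To make the limit concrete, here is a minimal sketch of what "falling outside the window" means. Whitespace-separated words stand in for real subword tokens, and the 4,096-token budget is purely illustrative:

def truncate_to_window(text, max_tokens=4096):
    """Keep only the most recent max_tokens pseudo-tokens of a prompt."""
    tokens = text.split()  # a real LLM uses a subword tokenizer, not split()
    if len(tokens) <= max_tokens:
        return text
    return " ".join(tokens[-max_tokens:])  # everything earlier is simply never seen

long_prompt = "word " * 10_000
print(len(truncate_to_window(long_prompt).split()))  # 4096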

2. Why Retrieval Augmented Generation (RAG) is Required

  • The Problem
    When users ask LLMs questions, the information may not fit within the limited context window. As a result, the LLM might give incomplete or incorrect answers.
  • What is RAG?
    Retrieval-Augmented Generation (RAG) solves this by combining LLMs with external data sources. Instead of relying solely on the model’s internal knowledge, RAG retrieves relevant information from databases or documents before generating a response.
  • How RAG Works
    • Retrieval: When a query is made, RAG retrieves relevant chunks of text from external sources.
    • Augmentation: These retrieved documents are then fed into the LLM’s context window.
    • Generation: The LLM uses both the input and the retrieved documents to create a response.
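The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline; it assumes a local Ollama server with the mxbai-embed-large and mistral-nemo models pulled (both are used later in this tutorial):

import numpy as np
import ollama

def embed(text, model="mxbai-embed-large"):
    return np.array(ollama.embeddings(model=model, prompt=text)["embedding"])

def retrieve(query, documents, top_k=3):
    """Retrieval: score every document chunk against the query embedding."""
    q = embed(query)
    def cos(v):
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
    return sorted(documents, key=lambda d: cos(embed(d)), reverse=True)[:top_k]

def answer(query, documents):
    context = "\n".join(retrieve(query, documents))   # Augmentation: fill the context window
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    return ollama.generate(model="mistral-nemo:latest", prompt=prompt)["response"]  # Generation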

3. Shortcomings of RAG

  • Challenges with Relevant Information
    RAG doesn’t always retrieve the most relevant data, leading to incoherent or irrelevant answers.
  • Efficiency
    Retrieving and processing large documents can be computationally expensive.
  • Context Switching
    When the retrieval process pulls in too many chunks of data, the model might struggle to maintain context, resulting in disjointed or inaccurate responses.

4. Solutions: Semantic Chunking, Ranking, and Re-ranking

  • Semantic Chunking
    Breaks down large documents into meaningful "chunks" based on content. This helps in retrieving smaller, more relevant parts of a document.
  • Ranking
    After retrieval, the system ranks the chunks based on their relevance to the query.
  • Re-ranking
    Uses machine learning algorithms to re-rank the retrieved documents to ensure that the most useful information is prioritized.
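As a toy illustration of this pipeline, the sketch below chunks a document into sentences, does a cheap lexical first-pass ranking, and then re-ranks the survivors with embedding similarity. The scoring choices are deliberately simple stand-ins; real systems often use a cross-encoder or a dedicated re-ranking model for the second pass:

import numpy as np
import ollama
from nltk.tokenize import sent_tokenize

def chunk(text):
    return sent_tokenize(text)  # semantic chunking reduced to one sentence per chunk

def rank(chunks, query, keep=20):
    q_words = set(query.lower().split())  # cheap first pass: lexical overlap with the query
    return sorted(chunks, key=lambda c: len(q_words & set(c.lower().split())), reverse=True)[:keep]

def re_rank(chunks, query, top_k=5):
    def embed(t):
        return np.array(ollama.embeddings(model="mxbai-embed-large", prompt=t)["embedding"])
    q = embed(query)
    def cos(v):
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
    return sorted(chunks, key=lambda c: cos(embed(c)), reverse=True)[:top_k]  # costlier second pass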

5. Issues that Still Persist

  • Complex Queries
    RAG still struggles with highly complex, multi-part questions that require a deep understanding of multiple documents.
  • Scaling
    As the size of external knowledge sources grows, retrieval efficiency and relevance can degrade.

6. Introduction to Graph Theory and Graph Databases

  • Graph Theory Basics
    In graph theory, data is represented as nodes (entities) and edges (relationships between entities). This allows complex relationships to be modeled in a highly structured way.
  • Graph Databases
    Unlike traditional databases, graph databases store data in the form of nodes and edges, making it easier to traverse relationships and retrieve connected information.

7. How Graph Databases Work

  • Nodes and Edges
    Nodes represent entities, while edges represent relationships between these entities. Graph queries allow for fast and intuitive exploration of connections, which can be helpful in retrieving contextual data.
  • Graph Algorithms
    Graph databases often employ algorithms like depth-first search or breadth-first search to efficiently find related data based on a query.
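NetworkX, which we use later in this tutorial, makes these ideas concrete in a few lines. The entities and relationships below are just a toy example:

import networkx as nx

g = nx.Graph()
# Nodes are entities, edges are relationships
g.add_edge("Mistral NeMo", "NVIDIA", relation="developed_with")
g.add_edge("Mistral NeMo", "Tekken", relation="uses_tokenizer")
g.add_edge("NVIDIA", "GPU", relation="manufactures")

# Breadth-first search walks outward from a starting entity
print(list(nx.bfs_tree(g, "Mistral NeMo")))   # ['Mistral NeMo', 'NVIDIA', 'Tekken', 'GPU']

# Edge attributes store the relationship type
print(g["Mistral NeMo"]["NVIDIA"]["relation"])  # developed_with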

8. What is GraphRAG?

  • Initial Concept
    GraphRAG combines graph theory with RAG to improve how information is retrieved and related across datasets. It enhances the retrieval process by mapping the relationships between pieces of data.
  • How GraphRAG Works

    • Graph-Based Retrieval: Instead of relying solely on document-level retrieval, GraphRAG uses graph databases to retrieve data based on the relationships between entities. This provides more contextually relevant data.

    • Traversing the Graph: Queries traverse the graph to identify not just relevant data but also data that is related through nodes and edges.

    • Improved Augmentation: This graph-based approach helps the LLM to understand not just the isolated pieces of information but also how they are related, improving the quality of generated responses.
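A minimal sketch of this idea, in the spirit of the implementation later in this post: match the query against node names, then pull in graph neighbours as extra context. The helper name and the toy graph are illustrative, not part of any library:

import networkx as nx

def graph_retrieve(graph, query, hops=1):
    """Return nodes whose name matches the query plus their neighbours up to `hops` away."""
    seeds = {n for n in graph.nodes if query.lower() in str(n).lower()}
    context, frontier = set(seeds), set(seeds)
    for _ in range(hops):
        frontier = {nbr for node in frontier for nbr in graph.neighbors(node)} - context
        context |= frontier
    return context

g = nx.Graph()
g.add_edge("Heat transfer depends on the temperature gradient.", "temperature")
g.add_edge("Fourier's law relates heat flux to the temperature gradient.", "temperature")
print(graph_retrieve(g, "temperature", hops=1))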

Prerequisites

Before diving into the tutorial, ensure you have the following installed:

  1. Python: Version 3.6 or higher. You can download it from the official Python website.

  2. Ollama: An AI framework designed for building and deploying large language models. More information can be found on the Ollama website.

  3. NetworkX: A Python library for the creation, manipulation, and study of the structure and dynamics of complex networks. You can find it on NetworkX’s GitHub page or its official documentation.

We have already created a GitHub repo, SimpleGRAPHRAG [9], to get you started with GraphRAG.

To get started, please visit that GitHub repo and clone it. For advanced users, the full code is given below.

Step 1: Setting up your project directory and virtual environment

  1. Create a Directory: The command mkdir ./graphrag/ creates a new directory named graphrag in the current working directory. This directory will be used to store all files related to the GraphRAG project.

  2. Change Directory: The command cd ./graphrag/ changes the current working directory to the newly created graphrag directory. This ensures that any subsequent commands are executed within this directory.

  3. Create a Virtual Environment: The command python -m venv graphrag creates a virtual environment named graphrag within the current directory. A virtual environment is an isolated environment that allows you to manage dependencies for your project without affecting the global Python installation.

  4. Activate the Virtual Environment: The command source graphrag/bin/activate activates the virtual environment on Unix-like systems; on Windows, use graphrag\Scripts\activate instead. Activating the virtual environment modifies your shell’s environment to use the Python interpreter and packages installed in that environment.

Following these steps prepares your workspace for developing with GraphRAG, ensuring that dependencies are managed within the isolated environment.

mkdir ./graphrag/
cd ./graphrag/
python -m venv graphrag
source graphrag/bin/activate

Step 2: Collecting all the required dependencies

We have already made a requirements.txt file that has all the dependencies.


cd ./SimpleGRAPHRAG/
pip install -r requirements.txt

Make sure you have all the required libraries installed, as they will be essential for the steps that follow.
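If you want a sense of what that file pulls in before running the install, the imports used later in this tutorial imply a requirements list along these lines (illustrative, not the repo's exact pinned versions):

ollama
networkx
numpy
matplotlib
PyPDF2
nltk
rake-nltk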

Step 3: Constructing a Knowledge Graph of sentences and embeddings with NetworkX & Ollama

In this step, we will create a set of functions that read files, break a whole book down into chapters, pages, paragraphs, sentences, and words, use the RAKE algorithm to find the main keywords for each node in the network, and then create vector embeddings for all the nodes and store them in a graph. Read the repo's README to better understand how all the functions work.

import os
from typing import Tuple
import pickle
import ollama
import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
import concurrent.futures
import re
import PyPDF2
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords
from rake_nltk import Rake

# Ensure you have the necessary NLTK resources
import nltk
nltk.download('punkt')
nltk.download('stopwords')

## this function below reads files

def read_file(file_path):
    """Read the content of a Markdown or PDF file."""
    if file_path.endswith('.pdf'):
        with open(file_path, 'rb') as file:
            reader = PyPDF2.PdfReader(file)
            text = ''
            for page in reader.pages:
                text += page.extract_text() + '\n'
            return text
    elif file_path.endswith('.md') or file_path.endswith('.markdown'):
        with open(file_path, 'r', encoding='utf-8') as file:
            return file.read()
    else:
        raise ValueError("Unsupported file type. Please provide a Markdown or PDF file.")

# this function was intended for chapter finding, but we could not use it due to complexity

def detect_table_of_contents(text):
    """Detect the Table of Contents in the document."""
    toc_pattern = re.compile(r'^(Chapter Contents \d+|[0-9]+\. [A-Za-z0-9 .-]+)(?:\s*-\s*[0-9]+)?$', re.MULTILINE)
    toc_matches = toc_pattern.findall(text)
    return toc_matches

# Here comes the most important function in this project.
# It forms the graph network by chunking chapters into pages, paragraphs, sentences, and words,
# generating embeddings for each of them,
# and then finding keywords with the RAKE keyword extraction algorithm,
# giving us a knowledge graph.
# This is a crude implementation, so the graph will be dense and the process will take time.
# If you manually give it the chapter names, it will be blazing fast.

def split_text_into_sections(text):
    """Split text into chapters, pages, paragraphs, sentences, and words."""

    def split_text(text, delimiters):
        """Split text using multiple delimiters."""
        # Create a regex pattern that matches any of the delimiters
        pattern = '|'.join(map(re.escape, delimiters))
        return re.split(pattern, text)

    chapternames = ["Bioprocess Development: An Interdisciplinary Challenge",
                    "Introduction to Engineering Calculations",
                    "Presentation and Analysis of Data",
                    "Material Balances",
                    "Energy Balances",
                    "Unsteady-State Material and Energy Balances",
                    "Fluid Flow and Mixing",
                    "Heat Transfer",
                    "Mass Transfer",
                    "Unit Operations",
                    "Homogeneous Reactions",
                    "Heterogeneous Reactions",
                    "Reactor Engineering",
                    "Appendices",
                    "Appendix A Conversion Factors",
                    "Appendix B Physical and Chemical Property Data",
                    "Appendix C Steam Tables",
                    "Appendix D Mathematical Rules",
                    "Appendix E List of Symbols",
                    "Index",
                    'A Special Tree', 'The School Among the Pines', 
                  'The Wind on Haunted Hill', 'Romi and the Wildfire', 'Tiger My Friend', 
                  'Monkey Trouble', 'Snake Trouble', 'Those Three Bears', 'The Coral Tree', 
                  "The Thief's Story", 'When the Trees Walked', 'Goodbye, Miss Mackenzie', 
                  'Pret in the House', 'The Overcoat', 'The Tunnel', 'Wild Fruit', 
                  'The Night the Roof Blew Off', "A Traveller's Tale", 'And Now We are Twelve']  # List of chapters already given for making it fast

    chapters = split_text(text,chapternames) # deactivate if not using the Biochem.md or rb.md
    #chapters=text.split('Chapter')  # activate if not using the Biochem.md
    graph = nx.Graph()
    stop_words = set(stopwords.words('english'))  # Load English stopwords

    def process_chapter(chapter):
        """Process a single chapter into pages, paragraphs, sentences, and words."""
        pages = chapter.split('\n\n')  # Assuming pages are separated by double newlines
        for page in pages:
            paragraphs = re.split(r'\n+', page)  # Split into paragraphs
            for paragraph in paragraphs:
                sentences = sent_tokenize(paragraph)  # Split into sentences using NLTK
                for sentence in sentences:
                    words = word_tokenize(sentence)  # Split into words using NLTK
                    filtered_words = [word for word in words if word.lower() not in stop_words]  # Remove stopwords

                    # Create nodes in the graph
                    graph.add_node(sentence)
                    sentence_embedding = get_embedding(sentence)
                    graph.nodes[sentence]['embedding'] = sentence_embedding  # Store embedding in the graph

                    for word in filtered_words:
                        graph.add_node(word)
                        graph.add_edge(sentence, word)  # Connect sentence to its words

                    # Extract keywords using RAKE
                    r = Rake()
                    r.extract_keywords_from_text(sentence)
                    keywords = r.get_ranked_phrases()
                    graph.nodes[sentence]['keywords'] = keywords  # Store keywords in the graph
                    for keyword in keywords:
                        graph.add_node(keyword)
                        keyword_embedding = get_embedding(keyword)
                        graph.nodes[keyword]['embedding'] = keyword_embedding  # Store embedding in the graph
                        graph.add_edge(sentence, keyword)  # Connect sentence to its keywords

                graph.add_edge(page, paragraph)  # Connect page to its paragraphs
            graph.add_edge(chapter, page)  # Connect chapter to its pages

    # Use multithreading to process chapters
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [executor.submit(process_chapter, chapter) for chapter in chapters]
        for future in concurrent.futures.as_completed(futures):
            try:
                future.result()  # Wait for the chapter processing to complete
            except Exception as e:
                print(f"Error processing chapter: {e}")

    return graph

# GraphRAG takes a lot of time to calculate on big books so we will save the graphs as pickle

def save_graph(graph, filepath):
    """Save the graph to a specified file path using pickle."""
    # Check if the filepath is a directory or a file
    if os.path.isdir(filepath):
        raise ValueError("Please provide a file name along with the directory path.")

    # Check if the file path ends with .gpickle
    if not filepath.endswith('.gpickle'):
        raise ValueError("File must have a .gpickle extension.")

    # Ensure the directory exists
    os.makedirs(os.path.dirname(filepath), exist_ok=True)

    # Save the graph using pickle
    with open(filepath, 'wb') as f:
        pickle.dump(graph, f, pickle.HIGHEST_PROTOCOL)
    print(f"Graph saved to {filepath}")

# load the saved graph for future use

def load_graph(filepath):
    """Load the graph from a specified file path using pickle."""
    # Check if the file exists
    if not os.path.isfile(filepath):
        raise FileNotFoundError(f"No such file: '{filepath}'")

    # Check if the file path ends with .gpickle
    if not filepath.endswith('.gpickle'):
        raise ValueError("File must have a .gpickle extension.")

    # Load the graph using pickle
    with open(filepath, 'rb') as f:
        graph = pickle.load(f)
    print(f"Graph loaded from {filepath}")
    return graph

# The embedding Function

def get_embedding(text, model="mxbai-embed-large"):
    """Get embedding for a given text using Ollama API."""
    response = ollama.embeddings(model=model, prompt=text)
    return response["embedding"]

# This function computes the cosine similarity between the query embedding and a chunk's embedding

def calculate_cosine_similarity(chunk, query_embedding, embedding):
    """Calculate cosine similarity between a chunk and the query."""
    if np.linalg.norm(query_embedding) == 0 or np.linalg.norm(embedding) == 0:
        return (chunk, 0)  # Handle zero vectors
    cosine_sim = np.dot(query_embedding, embedding) / (np.linalg.norm(query_embedding) * np.linalg.norm(embedding))
    return (chunk, cosine_sim)

# The retrieval portion of GraphRAG

def find_most_relevant_chunks(query, graph):
    """Find the most relevant chunks based on the graph and cosine similarity to the query."""
    # Step 1: Extract keywords from the query using RAKE
    r = Rake()
    r.extract_keywords_from_text(query)
    keywords = r.get_ranked_phrases()

    # Step 2: Find relevant sentences in the graph based on keywords
    relevant_sentences = set()
    for keyword in keywords:
        for node in graph.nodes():
            if keyword.lower() in node.lower():  # Check if keyword is in the node
                relevant_sentences.add(node)  # Add the whole sentence

    # Step 3: Calculate embeddings for relevant sentences
    similarities = {}
    query_embedding = get_embedding(query)

    for sentence in relevant_sentences:
        if sentence in graph.nodes:
            embedding = graph.nodes[sentence].get('embedding')
            if embedding is not None:
                cosine_sim = calculate_cosine_similarity(sentence, query_embedding, embedding)
                similarities[sentence] = cosine_sim[1]  # Store only the similarity score

    # Sort sentences by similarity
    sorted_sentences = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return sorted_sentences[:20]  # Return top 20 relevant sentences

# fetch the best answer

def answer_query(query, graph):
    """Answer a query using the graph and embeddings."""
    relevant_chunks = find_most_relevant_chunks(query, graph)
    context = " ".join(chunk for chunk, _ in relevant_chunks)  # Combine top chunks for context
    response = ollama.generate(model='mistral-nemo:latest', prompt=f"Context: {context} Question: {query}")  ## Change the LLM to any of your Ollama LLMs that supports tool use and logical reasoning

    if 'response' in response:
        return response['response']
    else:
        return "No answer generated."

Core Components

  1. Text Processing: Converts input text into a hierarchical structure.
  2. Graph Creation: Builds a NetworkX graph from the processed text.
  3. Embedding Generation: Uses Ollama to generate embeddings for text chunks.
  4. Retrieval: Finds relevant chunks based on query similarity.
  5. Answer Generation: Uses a language model to generate answers based on retrieved context.

Detailed Function Explanations

read_file(file_path)

Reads content from Markdown or PDF files.

Parameters:

  • file_path: Path to the input file

Returns:

  • String containing the file content

detect_table_of_contents(text)

Attempts to detect a table of contents in the input text.

Parameters:

  • text: Input text

Returns:

  • List of detected table of contents entries

split_text_into_sections(text)

Splits the input text into a hierarchical structure and creates a graph.

Parameters:

  • text: Input text

Returns:

  • NetworkX graph representing the text structure

save_graph(graph, filepath) and load_graph(filepath)

Save and load graph structures to/from disk using pickle.

Parameters:

  • graph: NetworkX graph object

  • filepath: Path to save/load the graph

get_embedding(text, model="mxbai-embed-large")

Generates embeddings for given text using Ollama API.

Parameters:

  • text: Input text

  • model: Embedding model to use

Returns:

  • Embedding vector

calculate_cosine_similarity(chunk, query_embedding, embedding)

Calculates cosine similarity between chunk and query embeddings.

Parameters:

  • chunk: Text chunk
  • query_embedding: Query embedding vector
  • embedding: Chunk embedding vector

Returns:

  • Tuple of (chunk, similarity score)

find_most_relevant_chunks(query, graph)

Finds the most relevant chunks in the graph based on the query.

Parameters:

  • query: Input query

  • graph: NetworkX graph of the text

Returns:

  • List of tuples containing (chunk, similarity score)

answer_query(query, graph)

Generates an answer to the query using the graph and a language model.

Parameters:

  • query: Input query

  • graph: NetworkX graph of the text

Returns:

  • Generated answer string

visualize_graph(graph)

Visualizes the graph structure using matplotlib.

Parameters:

  • graph: NetworkX graph object

Example Usage


# Build the graph from a source document
text = read_file("./rb.md")   # one of the source books mentioned in the code comments above; adjust to your own file
graph = split_text_into_sections(text)

# Save the graph
savefile = "./graphs/st5.gpickle"           # or: input("enter path for saving the knowledge base:")
save_graph(graph, savefile)

# Load a previously saved graph
graph = load_graph("./graphs/sample_graph.gpickle")

# Ask a question
query = "What is the significance of the cherry seed in the story?"
answer = answer_query(query, graph)
print(f"Question: {query}")
print(f"Answer: {answer}")

Visualization

The visualize_graph function can be used to create a visual representation of the graph structure. This is useful for small to medium-sized graphs but may become cluttered for very large texts. The layout and drawing steps are submitted to a thread pool, so it should run faster than a purely sequential version.

# visualizer is now multi threaded for speed

def visualize_graph(graph):
    """Visualize the graph using Matplotlib with improved layout to reduce overlap."""
    def draw_canvas(figsize: Tuple[int, int]):
        print("fig draw starting")
        plt.figure(figsize=figsize)  # Use the requested figure size for better visibility
        print("fig draw done \n\n")

    def draw_nodes(graph, pos):
        """Draw nodes in the graph."""
        print("node draw starts")
        nx.draw_networkx_nodes(graph, pos, node_size=1200, node_color='lightblue', alpha=0.7)
        print("node draw ends \n\n")

    def draw_edges(graph, pos):
        """Draw edges in the graph."""
        print("edge draw starts")
        nx.draw_networkx_edges(graph, pos, width=1.0, alpha=0.3)
        print("edge draw done \n\n")

    def draw_labels(graph, pos):
        """Draw labels in the graph."""
        print("drawing labels")
        labels = {}
        for node in graph.nodes():
            keywords = graph.nodes[node].get('keywords', [])
            label = ', '.join(keywords[:3])  # Limit to the first 3 keywords for clarity
            labels[node] = label if label else node[:10] + '...'  # Fallback to node name if no keywords
        nx.draw_networkx_labels(graph, pos, labels, font_size=16)  # Draw labels with smaller font size
        print("labels drawn \n\n")

    draw_canvas(figsize=(90,90))

    # Use ThreadPoolExecutor to handle layout and rescaling concurrently
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Submit layout calculation
        future_pos = executor.submit(nx.kamada_kawai_layout, graph)
        pos = future_pos.result()  # Get the result of the layout calculation

        # Submit rescaling of the layout
        future_rescale = executor.submit(nx.rescale_layout_dict, pos, scale=2)
        pos = future_rescale.result()  # Get the result of the rescaling

    # Use ThreadPoolExecutor to draw nodes, edges, and labels concurrently
    with concurrent.futures.ThreadPoolExecutor() as executor:
        executor.submit(draw_nodes, graph, pos)
        executor.submit(draw_edges, graph, pos)
        executor.submit(draw_labels, graph, pos)
    plt.title("Graph Visualization of Text Chunks")
    plt.axis('off')  # Turn off the axis
    plt.tight_layout()  # Adjust spacing for better layout
    plt.show()

Limitations and Future Work

  1. The current implementation may be slow for very large texts.
  2. Graph visualization can be improved for better readability.
  3. More advanced graph algorithms could be implemented for better retrieval.
  4. Integration with other embedding models and language models could be explored.
  5. Integration of a database-curation LLM that tries to form a preliminary answer from the database could make answers more accurate.

Conclusion

This tutorial has provided a comprehensive introduction to GraphRAG using Python, Ollama, and NetworkX. By creating a knowledge graph and integrating it with a language model, you can harness the power of graph-based retrieval to enhance the output of generative models. The combination of structured data and advanced AI techniques opens up new avenues for applications in various domains, including education, research, and content generation.

Feel free to expand upon this tutorial by adding more complex graphs, enhancing the retrieval logic, or integrating additional AI models as needed.

Key Points

  • GraphRAG combines graph structures with AI for enhanced data retrieval.
  • NetworkX is a powerful library for graph manipulation in Python.
  • Ollama provides capabilities for generative AI responses based on structured data.

This concludes the detailed tutorial on GraphRAG with Python, Ollama, and NetworkX. Happy coding!




References

[1] https://github.com/severian42/GraphRAG-Local-UI

[2] https://pypi.org/project/graphrag/0.3.0/

[3] https://microsoft.github.io/graphrag/posts/get_started/

[4] https://www.youtube.com/watch?v=zDv8akdf6v4

[5] https://dev.to/stephenc222/implementing-graphrag-for-query-focused-summarization-47ib

[6] https://iblnews.org/microsoft-open-sourced-graphrag-python-library-to-extract-insights-from-text/

[7] https://neo4j.com/developer-blog/neo4j-genai-python-package-graphrag/

[8] https://github.com/stephenc222/example-graphrag

[9] https://github.com/hr1juldey/SimpleGRAPHRAG/tree/main



Learn DSPy: Analyze LinkedIn Posts with DSPy and Pandas

Unlock the Secrets of LinkedIn Posts with DSPy and Pandas

Social media is a goldmine of data, and LinkedIn is no exception. But how do you extract valuable insights from all those posts? This guide will show you how to leverage the power of DSPy and Pandas to analyze LinkedIn posts and uncover hidden trends.

In this blog post, you’ll learn:

How to use DSPy to programmatically analyze text data
How to leverage Pandas for data manipulation and cleaning
How to extract key insights from your LinkedIn posts using DSPy signatures
How to use emojis and hashtags to classify post types

Introduction

In today’s digital age, social media platforms like LinkedIn are treasure troves of data. Analyzing this data can help us understand trends, engagement, and the overall effectiveness of posts. In this guide, we will explore how to leverage two powerful tools—DSPy and Pandas—to analyze LinkedIn posts and extract valuable insights. Our goal is to provide a step-by-step approach that is easy to follow and understand, even for beginners.

What is Pandas?

Pandas is a widely-used data manipulation library in Python, essential for data analysis. It provides powerful data structures like DataFrames, which allow you to organize and manipulate data in a tabular format (think of it like a spreadsheet). With Pandas, you can perform operations such as filtering, grouping, and aggregating data.

Key Features of Pandas

  • DataFrame Structure: A DataFrame is a two-dimensional labeled data structure that can hold data of different types (like integers, floats, and strings).
  • Data Manipulation: Pandas makes it easy to clean and preprocess data, making it ready for analysis.
  • Integration with Other Libraries: It works well with other Python libraries, such as Matplotlib for visualization and NumPy for numerical operations.

For a foundational understanding of Pandas, check out Danielle B.’s Python Pandas Tutorial.
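As a quick taste of the DataFrame operations used throughout this post, here is a tiny, self-contained example with made-up data and the same column names we use later:

import pandas as pd

# A tiny DataFrame standing in for the LinkedIn dataset
df = pd.DataFrame({
    "Post Text": ["Hiring! #python", "Great event today", "New blog post #ai #ml"],
    "Reactions": [120, 45, 310],
})

# A derived column, a filter, and a simple aggregate
df["Post Text_len"] = df["Post Text"].str.len()
popular = df[df["Reactions"] > 100]
print(popular[["Post Text", "Post Text_len"]])
print("Average reactions:", df["Reactions"].mean())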

What is DSPy?

DSPy is a framework designed for programming language models (LMs) to optimize data analysis. Unlike traditional methods that rely heavily on prompting, DSPy enables users to structure data and model interactions more effectively, making it particularly useful for analyzing large datasets.

Key Features of DSPy

  • Prompt Programming: DSPy is a programming language designed to compile (and iteratively optimize) ideal prompts to achieve the desired output from a query.

  • High Reproducibility of Responses: When used with proper signatures and optimizers, DSPy can provide highly reliable and reproducible answers to your questions with zero—and I mean zero—hallucinations. We have tested DSPy over the last 21 days through various experiments 😎 with Mistral-Nemo as the LLM of choice, and it has either provided the correct answer or remained silent.

  • Model Interactions: Unlike most ChatGPT clones and AI tools that utilize OpenAI or other models in the backend, DSPy offers similar methods for using local or online API-based LLMs to perform tasks. You can even use GPT4o-mini as a manager or judge, local LLMs like phi3 as readers, and Mistral as writers. This allows you to create a complex system of LLMs and tasks, which in the field of Generative AI, we refer to as a Generative Feedback Loop (GFL).

  • Custom Dataset Loading: DSPy makes it easy to load and manipulate your own datasets or stream datasets from a remote or localhost server.

To get started with DSPy, visit the DSPy documentation, which includes detailed information on loading custom datasets.

Systematic Optimization

Choose from a range of optimizers to enhance your program. Whether generating refined instructions or fine-tuning weights, DSPy’s optimizers are engineered to maximize efficiency and effectiveness.

Modular Approach

With DSPy, you can build your system using predefined modules, replacing intricate prompting techniques with straightforward and effective solutions.

Cross-LM Compatibility

Whether you’re working with powerhouse models like GPT-3.5 or GPT-4, or local models such as T5-base or Llama2-13b, DSPy seamlessly integrates and enhances their performance within your system.

Citations:
[1] https://dspy-docs.vercel.app


Getting started with LinkedIn post data

There are web scraping tools online, both paid and free. You can use any of them for educational purposes, as long as you don’t collect personal data. Although we will release the dataset, for security reasons we have to refrain from revealing our sources.
The dataset we will be using is this Dataset.

Don’t try to open the dataset in Excel or Google Sheets; it might break!

Open it in a text editor or in Microsoft Data Wrangler.

Loading the data

To get started, follow these steps:

  1. Download the Dataset: Download the dataset from the link provided above.

  2. Set Up a Python Virtual Environment:

    • Open your terminal or command prompt.
    • Navigate to the directory or folder where you want to set up the virtual environment.
    • Create a virtual environment by running the following command:
      python -m venv myenv
    • Activate the virtual environment:
      • On Windows:
        myenv\Scripts\activate
      • On macOS/Linux:
        source myenv/bin/activate
  3. Create a Subfolder for the Data:

    • Inside your main directory, create a subfolder to hold the data. You can do this with the following command:
      mkdir data
  4. Create a Jupyter Notebook:

    • Install Jupyter Notebook if you haven’t already:
      pip install jupyter
    • Start Jupyter Notebook by running:
      jupyter notebook
    • In the Jupyter interface, create a new notebook in your desired directory.
  5. Follow Along: Use the notebook to analyze the dataset and perform your analysis.

By following these steps, you’ll be set up and ready to work with your dataset!

Checking the text length of each post

To gain some basic insights from the data we have, we will start by checking the length of the posts.


import pandas as pd
import os

def add_post_text_length(input_csv_path):
    # Read the CSV file into a DataFrame
    df = pd.read_csv(input_csv_path)

    # Check if 'Post Text' column exists
    if 'Post Text' not in df.columns:
        raise ValueError("The 'Post Text' column is missing from the input CSV file.")

    # Create a new column 'Post Text_len' with the length of 'Post Text'
    df['Post Text_len'] = df['Post Text'].apply(len)

    # Define the output CSV file path
    output_csv_path = os.path.join(os.path.dirname(input_csv_path), 'linkedin_posts_cleaned_An1.csv')

    # Write the modified DataFrame to a new CSV file
    df.to_csv(output_csv_path, index=False)

    print(f"New CSV file with post text lengths has been created at: {output_csv_path}")

# Example usage
input_csv = 'Your/directory/to/code/LinkedIn/pure _data/linkedin_posts_cleaned_o.csv'  # Replace with your actual CSV file path
add_post_text_length(input_csv)

Emoji classification

Social media is a fun space, and LinkedIn is no exception—emojis are a clear indication of that. Let’s explore how many people are using emojis and the frequency of their usage.


import pandas as pd
import emoji

# Load your dataset
df = pd.read_csv('Your/directory/to/code/LinkedIn/pure _data/linkedin_posts_cleaned_An1.csv') ### change them

# Create a new column to check for emojis
df['has_emoji'] = df['Post Text'].apply(lambda x: 'yes' if any(char in emoji.EMOJI_DATA for char in x) else 'no')

# Optionally, save the updated dataset
df.to_csv('Your/directory/to/code/LinkedIn/pure _data/linkedin_posts_cleaned_An2.csv', index=False) ### change them

The code above will perform a binary classification of posts, distinguishing between those that contain emojis and those that do not.

Quantitative classification of emojis

We will analyze the data on emojis, concentrating on their usage by examining different emoji types and their frequency of use.


import pandas as pd
import emoji
from collections import Counter

# Load the dataset
df = pd.read_csv('Your/directory/to/code/LinkedIn/pure _data/linkedin_posts_cleaned_An2.csv') ### change them

# Function to analyze emojis in the post text
def analyze_emojis(post_text):
    # Extract emojis from the text
    emojis_in_text = [char for char in post_text if char in emoji.EMOJI_DATA]

    # Count total number of emojis
    num_emojis = len(emojis_in_text)

    # Count frequency of each emoji
    emoji_counts = Counter(emojis_in_text)

    # Prepare lists of emojis and their frequencies
    emoji_list = list(emoji_counts.keys()) if emojis_in_text else ['N/A']
    frequency_list = list(emoji_counts.values()) if emojis_in_text else [0]

    return num_emojis, emoji_list, frequency_list

# Apply the function to the 'Post Text' column and assign results to new columns
df[['Num_emoji', 'Emoji_list', 'Emoji_frequency']] = df['Post Text'].apply(
    lambda x: pd.Series(analyze_emojis(x))
)

# Optionally, save the updated dataset
df.to_csv('Your/directory/to/code/LinkedIn/pure _data/linkedin_posts_cleaned_An3.csv', index=False) ### change them

# Display the updated DataFrame
print(df[['Serial Number', 'Post Text', 'Num_emoji', 'Emoji_list', 'Emoji_frequency']].head())

Hashtag classification

Hashtags are an important feature of online posts, as they provide valuable context about the content. Analyzing the hashtags in this dataset will help us conduct more effective Exploratory Data Analysis (EDA) in the upcoming steps.

We will do both a binary classification of posts by whether they use hashtags and extract the hashtags that have been used.


import pandas as pd
import re

# Load the dataset
df = pd.read_csv('Your/directory/to/code/DSPyW/LinkedIn/pure _data/linkedin_posts_cleaned_An3.csv')

# Function to check for hashtags and list them
def analyze_hashtags(post_text):
    # Find all hashtags in the post text using regex
    hashtags = re.findall(r'hashtag\s+#\s*(\w+)', post_text)

    # Check if any hashtags were found
    has_hashtags = 'yes' if hashtags else 'no'

    # Return the has_hashtags flag and the list of hashtags
    return has_hashtags, hashtags if hashtags else ['N/A']

# Apply the function to the 'Post Text' column and assign results to new columns
df[['Has_Hashtags', 'Hashtag_List']] = df['Post Text'].apply(
    lambda x: pd.Series(analyze_hashtags(x))
)

# Optionally, save the updated dataset
df.to_csv('Your/directory/to/code/DSPyW/LinkedIn/pure _data/linkedin_posts_cleaned_An4.csv', index=False)

# Display the updated DataFrame
print(df[['Serial Number', 'Post Text', 'Has_Hashtags', 'Hashtag_List']].head())

Prepare the dataset for DSPy

DSPy loves datasets that come as a simple data structure: a list of dictionaries. We will convert our dataset into a list of dictionaries and learn to split it for testing and training; more experiments are coming soon on AI&U.


import pandas as pd
import dspy
from dspy.datasets.dataset import Dataset

class CSVDataset(Dataset):
    def __init__(self, file_path, train_size=5, dev_size=50, test_size=0, train_seed=1, eval_seed=2023) -> None:
        super().__init__()
        # define the inputs
        self.file_path=file_path
        self.train_size=train_size
        self.dev_size=dev_size
        self.test_size=test_size
        self.train_seed=train_seed
        #Just to have a default seed for future testing
        self.eval_seed=eval_seed
        # Load the CSV file into a DataFrame
        df = pd.read_csv(file_path)

        # Shuffle the DataFrame for randomness
        df = df.sample(frac=1, random_state=train_seed).reset_index(drop=True)

        # Split the DataFrame into train, dev, and test sets
        self._train = df.iloc[:train_size].to_dict(orient='records')  # Training data
        self._dev = df.iloc[train_size:train_size + dev_size].to_dict(orient='records')  # Development data
        self._test = df.iloc[train_size + dev_size:train_size + dev_size + test_size].to_dict(orient='records')  # Testing data (if any)

# Example usage
# filepath
filepath='Your/directory/to/code/DSPyW/LinkedIn/pure _data/linkedin_posts_cleaned_An4.csv' # change it
# Create an instance of the CSVDataset
dataset = CSVDataset(file_path=filepath,train_size=200, dev_size=200, test_size=1100, train_seed=64, eval_seed=2023)

# Accessing the datasets
train_data = dataset._train
dev_data = dataset._dev
test_data = dataset._test

# Print the number of samples in each dataset
print(f"Number of training samples: {len(train_data)}, \n\n--- sample: {train_data[0]['Post Text'][:300]}") ### showing the first 300 characters of the post text
print(f"Number of development samples: {len(dev_data)}")
print(f"Number of testing samples: {len(test_data)}")

Setting up LLMs for inference

We are using **mistral-nemo:latest** as a strong local LLM for inference, as it can run on most gaming laptops and has performed reliably in our experiments over the last few weeks.

Mistral NeMo is a state-of-the-art language model developed through a collaboration between Mistral AI and NVIDIA. It features 12 billion parameters and is designed to excel in various tasks such as reasoning, world knowledge application, and coding accuracy. Here are some key aspects of Mistral NeMo:

Key Features

  • Large Context Window: Mistral NeMo can handle a context length of up to 128,000 tokens, allowing it to process long-form content and complex documents effectively [1], [2].

  • Performance: This model is noted for its advanced reasoning capabilities and exceptional accuracy in coding tasks, outperforming other models of similar size, such as Gemma 2 and Llama 3, in various benchmarks[2],[3].

  • Multilingual Support: Mistral NeMo supports a wide range of languages, including English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi, making it versatile for global applications[2], [3].

  • Tokenizer: It utilizes a new tokenizer called Tekken, which is more efficient in compressing natural language text and source code compared to previous models. This tokenizer enhances performance across multiple languages [2], [3].

  • Integration and Adaptability: Mistral NeMo is built on a standard architecture that allows it to be easily integrated into existing systems as a drop-in replacement for earlier models like Mistral 7B [1], [2].

  • Fine-tuning and Alignment: The model has undergone advanced fine-tuning to enhance its ability to follow instructions and engage in multi-turn conversations, making it suitable for interactive applications[2], [3].

Mistral NeMo is released under the Apache 2.0 license, promoting its adoption for both research and enterprise use.


import dspy
# Define the languge Model 
olm=dspy.OpenAI(api_base="http://localhost:11434/v1/", api_key="ollama", model="mistral-nemo:latest", stop='\n\n', model_type='chat')
dspy.settings.configure(lm=olm)

Using DSPy Signatures to Contextualize and Classify LinkedIn Posts

We are using hashtags and emojis as guides to classify the posts made on LinkedIn. Hashtags, being strings of text, can act as good hints, but we also want to check whether emojis are powerful features for finding context. The final dataset will contain these classifications and contexts. In future experiments, we will explore ways to achieve high accuracy in predicting both the context and the classification.


import dspy

# Define the signature for the model
class PostContext(dspy.Signature):
    """Summarize the LinkedIn post context in 15 words and classify it into the type of post."""
    post_text = dspy.InputField(desc="Can be a social media post about a topic; ignore all occurrences of \n, \n\n, \n\n\n ")
    emoji_hint = dspy.InputField(desc="is a list of emojis that can be in the post_text")
    hashtag_hint = dspy.InputField(desc="is a list of hashtags like 'hashtag\s+#\s*(\w+)' that gives a hint on main topic")
    context = dspy.OutputField(desc=f"Generate a 10 word faithful summary that describes the context of the {post_text} using {hashtag_hint} and {emoji_hint}")
    classify=dspy.OutputField(desc=f"Classify the subject of {post_text} using {context} as hint, ONLY GIVE 20 Word CLASSIFICATION, DON'T give Summary")

# Select only the desired keys for the DSPy examples
selected_keys = ['Post Text','Post Text_len','has_emoji','Num_emoji','Emoji_list','Emoji_frequency','Has_Hashtags', 'Hashtag_List']

# Prepare trainset and devset for DSPy
trainset = [{key: item[key] for key in selected_keys if key in item} for item in train_data]
devset = [{key: item[key] for key in selected_keys if key in item} for item in dev_data]
testset=[{key: item[key] for key in selected_keys if key in item} for item in test_data]

# Print lengths of the prepared datasets
#print(f"Length of trainset: {len(trainset)}")
#print(f"Length of devset: {len(devset)}")

# Define the languge Model 
olm=dspy.OpenAI(api_base="http://localhost:11434/v1/", api_key="ollama", model="mistral-nemo:latest", stop='\n\n', model_type='chat')
dspy.settings.configure(lm=olm)
# Initialize the ChainOfThoughtWithHint model
predict_context=dspy.ChainOfThoughtWithHint(PostContext)
# Example prediction for the first post in the dev set
if devset:
    example_post = devset[5]
    prediction = predict_context(
        post_text=example_post['Post Text'],
        emoji_hint=example_post['Emoji_list'],
        hashtag_hint=example_post['Hashtag_List']
    )
    print(f"Predicted Context for the example post:\n{prediction.context}\n\n the type of post can be classified as:\n\n {prediction.classify} \n\n---- And the post is:\n {example_post['Post Text'][:300]} \n\n...... ")
    #print(example_post['Post Text_len'])

Now we will move on to creating the context and classification for the dataset.

We make a subset of the data that has hashtags and emojis, which can be used for faithful classification, and test whether the model is working.


# Define the languge Model 
olm=dspy.OpenAI(api_base="http://localhost:11434/v1/", api_key="ollama", model="mistral-nemo:latest", stop='\n\n', model_type='chat')
dspy.settings.configure(lm=olm)
# Initialize the ChainOfThoughtWithHint model
predict_context_with_hint=dspy.ChainOfThoughtWithHint(PostContext)

for i in range(len(trainset)):
    if trainset[i]["Post Text_len"] < 1700 and trainset[i]["Has_Hashtags"] == "yes":
        ideal_post = trainset[i]
        prediction = predict_context_with_hint(
            post_text=ideal_post['Post Text'],
            emoji_hint=ideal_post['Emoji_list'],
            hashtag_hint=ideal_post['Hashtag_List']
        )
        print(f"The predicted Context is:\n\n {prediction.context}\n\n And the type of post is:\n\n {prediction.classify}\n\n-----")
    else:
        continue

Write the subset to a new version of the input CSV file with context and classification

Now that we have classified and contextualized the posts, we can store the data in a new CSV.


import pandas as pd
import dspy
import os

# Define the language Model
olm = dspy.OpenAI(api_base="http://localhost:11434/v1/", api_key="ollama", model="mistral-nemo:latest", stop='\n\n', model_type='chat')
dspy.settings.configure(lm=olm)

# Initialize the ChainOfThoughtWithHint model
predict_context_with_hint = dspy.ChainOfThoughtWithHint(PostContext)

def process_csv(input_csv_path):
    # Read the CSV file into a DataFrame
    df = pd.read_csv(input_csv_path)

    # Check if necessary columns exist
    if 'Post Text' not in df.columns or 'Post Text_len' not in df.columns or 'Has_Hashtags' not in df.columns:
        raise ValueError("The input CSV must contain 'Post Text', 'Post Text_len', and 'Has_Hashtags' columns.")

    # Create new columns for predictions
    df['Predicted_Context'] = None
    df['Predicted_Post_Type'] = None

    # Iterate over the DataFrame rows
    for index, row in df.iterrows():
        if row["Post Text_len"] < 1600 and row["Has_Hashtags"] == "yes":
            prediction = predict_context_with_hint(
                post_text=row['Post Text'],
                emoji_hint=row['Emoji_list'],
                hashtag_hint=row['Hashtag_List']
            )
            df.at[index, 'Predicted_Context'] = prediction.context
            df.at[index, 'Predicted_Post_Type'] = prediction.classify

    # Define the output CSV file path
    output_csv_path = os.path.join(os.path.dirname(input_csv_path), 'LinkedIn_data_final_output.csv')

    # Write the modified DataFrame to a new CSV file
    df.to_csv(output_csv_path, index=False)

    print(f"New CSV file with predictions has been created at: {output_csv_path}")

# Example usage
input_csv = 'Your/directory/to/code/DSPyW/LinkedIn/pure _data/linkedin_posts_cleaned_An4.csv'  # Replace with your actual CSV file path
process_csv(input_csv)

Conclusion

Combining DSPy with Pandas provides a robust framework for extracting insights from LinkedIn posts. By following the outlined steps, you can effectively analyze data, visualize trends, and derive meaningful conclusions. This guide serves as a foundational entry point for those interested in leveraging data science tools to enhance their understanding of social media dynamics.

By utilizing the resources and coding examples provided, you can gain valuable insights from your LinkedIn posts and apply these techniques to other datasets for broader applications in data analysis. Start experimenting with your own LinkedIn data today and discover the insights waiting to be uncovered!


This guide is designed to be engaging and informative, ensuring that readers, regardless of their experience level, can follow along and gain valuable insights from their LinkedIn posts. Happy analyzing!

References

  1. Danielle B.’s Post – Python pandas tutorial – LinkedIn 🐼💻 Excited to share some insights into using pandas for data analysis in Py…
  2. Unlocking the Power of Data Science with DSPy: Your Gateway to AI … Our YouTube channel, “DSPy: Data Science and AI Mastery,” is your ultimate …
  3. Creating a Custom Dataset – DSPy To create a list of Example objects, we can simply load data from the source and…
  4. Models Don’t Matter: Building Compound AI Systems with DSPy and … To get started, we’ll install the DSPy library, set up the DBRX fo…
  5. A Step-by-Step Guide to Data Analysis with Pandas and NumPy In this blog post, we will walk through a step-by-step guide on h…
  6. DSPy: The framework for programming—not prompting—foundation … DSPy is a framework for algorithmically optimizing LM prom…
  7. An Exploratory Tour of DSPy: A Framework for Programing … – Medium An Exploratory Tour of DSPy: A Framework for Programing Language M…
  8. Inside DSPy: The New Language Model Programming Framework … The DSPy compiler methodically traces the program’…
  9. Leann Chen on LinkedIn: #rag #knowledgegraphs #dspy #diffbot We designed a custom DSPy pipeline integrating with knowledge graphs. The …
  10. What’s the best way to use Pandas in Program of Thought #1004 I want to build an agent to answer questions using…



Learning DSPy: Optimizing Question Answering of Local LLMs

Revolutionize AI!
Master question-answering with Mistral NeMo, a powerful LLM, alongside Ollama and DSPy. This post explores optimizing ReAct agents for complex tasks using Mistral NeMo’s capabilities and DSPy’s optimization tools. Unlock the Potential of Local LLMs: Craft intelligent AI systems that understand human needs. Leverage Mistral NeMo for its reasoning and context window to tackle intricate queries. Embrace the Future of AI Development: Start building optimized agents today! Follow our guide and code examples to harness the power of Mistral NeMo, Ollama, and DSPy.

Learning DSPy with Ollama and Mistral-NeMo

In the realm of artificial intelligence, the ability to process and understand human language is paramount. One of the most promising advancements in this area is the emergence of large language models like Mistral NeMo, which excel at complex tasks such as question answering. This blog post will explore how to optimize the performance of a ReAct agent using Mistral NeMo in conjunction with Ollama and DSPy. For further insights into the evolving landscape of AI and the significance of frameworks like DSPy, check out our previous blog discussing the future of prompt engineering here.

What is Mistral NeMo?

Mistral NeMo is a state-of-the-art language model developed in partnership with NVIDIA. With 12 billion parameters, it offers impressive capabilities in reasoning, world knowledge, and coding accuracy. One of its standout features is its large context window, which can handle up to 128,000 tokens of text—this allows it to process and understand long passages, making it particularly useful for complex queries and dialogues (NVIDIA).

Key Features of Mistral NeMo

  1. Large Context Window: This allows Mistral NeMo to analyze and respond to extensive texts, accommodating intricate questions and discussions.
  2. State-of-the-Art Performance: The model excels in reasoning tasks, providing accurate and relevant answers.
  3. Collaboration with NVIDIA: By leveraging NVIDIA’s advanced technology, Mistral NeMo incorporates optimizations that enhance its performance.

Challenges in Optimization

While Mistral NeMo is a powerful tool, there are challenges when it comes to optimizing and fine-tuning ReAct agents. One significant issue is that the current documentation does not provide clear guidelines on implementing few-shot learning techniques effectively. This can affect the adaptability and overall performance of the agent in real-world applications (Hugging Face).

What is a ReAct Agent?

Before diving deeper, let’s clarify what a ReAct agent is. ReAct, short for "Reasoning and Acting," refers to AI systems designed to interact with users by answering questions and performing tasks based on user input. These agents can be applied in various fields, from customer service to educational tools (OpenAI).

Integrating DSPy for Optimization

To overcome the challenges mentioned above, we can use DSPy, a framework specifically designed to optimize ReAct agents. Here are some of the key functionalities DSPy offers:

  • Simulating Traces: This feature allows developers to inspect data and simulate traces through the program, helping to generate both good and bad examples.
  • Refining Instructions: DSPy can propose or refine instructions based on performance feedback, making it easier to improve the agent’s effectiveness.

Setting Up a ReAct Agent with Mistral NeMo and DSPy

Now that we have a good understanding of Mistral NeMo and DSPy, let’s look at how to set up a simple ReAct agent using these technologies. Below, you’ll find a code example that illustrates how to initialize the Mistral NeMo model through Ollama and optimize it using DSPy.

Code Example

Here’s sample code that uses the HotPotQA dataset and ColBERTv2, a retrieval model, to test and optimize a ReAct agent that uses mistral-nemo:latest as the LLM.

Step-by-Step Breakdown of the Code

1. Importing Libraries and Configuring Models:

First, we will import the DSPy modules evaluate, datasets, and teleprompt.
The first is used to check the performance of a DSPy agent.
The second loads built-in datasets to evaluate the performance of the LLMs.
The third is an optimization framework for training and tuning the prompts that are provided to the LLMs.



import dspy
from dspy.evaluate import Evaluate
from dspy.datasets.hotpotqa import HotPotQA
from dspy.teleprompt import BootstrapFewShotWithRandomSearch

ollama=dspy.OllamaLocal(model='mistral-nemo:latest')
colbert = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.configure(lm=ollama, rm=colbert)

2. Loading some data:

We will now load the data and split it into training, validation, and development sets.



dataset = HotPotQA(train_seed=1, train_size=200, eval_seed=2023, dev_size=300, test_size=0)
trainset = [x.with_inputs('question') for x in dataset.train[0:150]]
valset = [x.with_inputs('question') for x in dataset.train[150:200]]
devset = [x.with_inputs('question') for x in dataset.dev]

# show an example datapoint; it's just a question-answer pair
trainset[23]

3. Creating a ReAct Agent:

First we will make a default (Dumb 😂) ReAct agent


agent = dspy.ReAct("question -> answer", tools=[dspy.Retrieve(k=1)])

4. Evaluating the agent:

Set up an evaluator on the first 300 examples of the devset.


config = dict(num_threads=8, display_progress=True, display_table=25)
evaluate = Evaluate(devset=devset, metric=dspy.evaluate.answer_exact_match, **config)

evaluate(agent)

5. Optimizing the ReAct Agent:

Now we will (try to) put some brains into the dumb agent by training it


config = dict(max_bootstrapped_demos=2, max_labeled_demos=0, num_candidate_programs=5, num_threads=8)
tp = BootstrapFewShotWithRandomSearch(metric=dspy.evaluate.answer_exact_match, **config)
optimized_react = tp.compile(agent, trainset=trainset, valset=valset)

6. Testing the Agent:

Now we will check if the agents have become smart (enough)


evaluate(optimized_react)

Conclusion

Integrating Mistral NeMo with Ollama and DSPy presents a powerful framework for developing and optimizing question-answering ReAct agents. By leveraging the model’s extensive capabilities, including its large context window, tool-calling capabilities, and advanced reasoning skills, developers can create AI agents that efficiently handle complex queries with high accuracy in a local setting.

However, it’s essential to address the gaps in the current documentation regarding optimization techniques for local and open-source models and agents. By understanding these challenges and utilizing tools like DSPy, developers can significantly enhance the performance of their AI projects.

As AI continues to evolve, the integration of locally running models like Mistral NeMo will play a crucial role in creating intelligent systems capable of understanding and responding to human needs. With the right tools and strategies, developers can harness the full potential of these technologies, ultimately leading to more sophisticated and effective AI applications.

By following the guidance provided in this blog post, you can start creating your own optimized question-answering agents using Mistral NeMo, Ollama, and DSPy. Happy coding!

References

  1. Creating ReAct AI Agents with Mistral-7B/Mixtral and Ollama using …
  2. Mistral NeMo – Hacker News
  3. Lack of Guidance on Optimizing/Finetuning ReAct Agent with Few …
  4. Introducing Mistral NeMo – Medium
  5. Optimizing Multi-Agent Systems with Mistral Large, Nemo … – Zilliz
  6. mistral-nemo – Ollama
  7. Mistral NeMo: THIS IS THE BEST LLM Right Now! (Fully … – YouTube
  8. dspy/README.md at main · stanfordnlp/dspy – GitHub
  9. Is Prompt Engineering Dead? DSPy Says Yes! – AI&U


    Your thoughts matter—share them with us on LinkedIn here.

    Want the latest updates? Visit AI&U for more in-depth articles now.


Declaration: The whole blog itself is written using Ollama, CrewAI, and DSPy 👀

Is Prompt Engineering Dead? DSPy Says Yes!

DSPy, a new programming framework, is revolutionizing how we interact with language models. Unlike traditional manual prompting, DSPy offers a systematic approach that enhances reliability and flexibility. By focusing on what you want to achieve, DSPy simplifies development and allows for more robust applications. This open-source Python framework is ideal for chatbots, recommendation systems, and other AI-driven tasks. Try DSPy today and experience the future of AI programming.

Introduction to DSPy: The Prompt Programming Language

In the world of technology, programming languages and frameworks are the backbone of creating applications that help us in our daily lives. One of the exciting new developments in this area is DSPy, a programming framework that promises to revolutionize how we interact with language models and retrieval systems. In this blog post, we will explore what DSPy is, its advantages, the modular design it employs, and how it embraces a declarative programming style. We will also look at some practical use cases, and I’ll provide you with a simple code example to illustrate how DSPy works.

What is DSPy?

DSPy, short for "Declarative Self-improving Python," is an open-source Python framework designed to simplify the development of applications that utilize language models (LMs) and retrieval models (RMs). Unlike traditional methods that rely heavily on manually crafted prompts to get responses from language models, DSPy shifts the focus to systematic programming.

Why DSPy Matters

Language models like GPT-3, llama3.1 and others have become incredibly powerful tools for generating human-like text. However, using them effectively can often feel like a trial-and-error process. Developers frequently find themselves tweaking prompts endlessly, trying to coax the desired responses from these models. This approach can lead to inconsistent results and can be quite fragile, especially when dealing with complex applications.

DSPy addresses these issues by providing a framework that promotes reliability and flexibility. It allows developers to create applications that can adapt to different inputs and requirements, enhancing the overall user experience.

Purpose and Advantages of DSPy

1. Enhancing Reliability

One of the main goals of DSPy is to tackle the fragility commonly associated with language model applications. By moving away from a manual prompting approach, DSPy enables developers to build applications that are more robust. This is achieved through systematic programming that reduces the chances of errors and inconsistencies.

2. Streamlined Development Process

With DSPy, developers can focus on what they want to achieve rather than getting bogged down in how to achieve it. This shift in focus simplifies the development process, making it easier for both experienced and novice programmers to create effective applications.

3. Modular Design

DSPy promotes a modular design, allowing developers to construct pipelines that can easily integrate various language models and retrieval systems. This modularity enhances the maintainability and scalability of applications. Developers can build components that can be reused and updated independently, making it easier to adapt to changing requirements.
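
To make this concrete, here is a minimal sketch of a modular DSPy pipeline, assuming a language model and a retriever have already been registered via dspy.configure; the class name and field names are illustrative, not taken from any official example:

import dspy

# A small two-stage pipeline: retrieve supporting passages, then answer over them.
class SimpleRAG(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=num_passages)                        # retrieval component
        self.generate = dspy.ChainOfThought("context, question -> answer")   # generation component

    def forward(self, question):
        context = self.retrieve(question).passages
        return self.generate(context=context, question=question)

# Each component can be swapped, reused, or re-optimized independently.
rag = SimpleRAG()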

Declarative Programming: A New Approach

One of the standout features of DSPy is its support for declarative programming. This programming style allows developers to specify what they want to achieve without detailing how to do it. For example, instead of writing out every step of a process, a developer can express the desired outcome, and the framework handles the underlying complexity.
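
For example, in DSPy you can declare a task as a one-line signature of inputs and outputs, and the framework works out how to prompt the model. A minimal sketch, with illustrative field names:

import dspy

# Declarative style: state WHAT you want (inputs -> outputs), not how to prompt for it.
summarize = dspy.ChainOfThought("document -> summary")
translate = dspy.Predict("english_text -> french_text")

# DSPy expands these one-line signatures into full prompts behind the scenes.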

Benefits of Declarative Programming

  • Simplicity: By abstracting complex processes, developers can focus on higher-level logic.
  • Readability: Code written in a declarative style is often easier to read and understand, making it accessible to a broader audience.
  • Maintainability: Changes can be made more easily without needing to rework intricate procedural code.

Use Cases for DSPy

DSPy is particularly useful for applications that require dynamic adjustments based on user input or contextual changes. Here are a few examples of where DSPy can shine:

1. Chatbots

Imagine a chatbot that can respond to user queries in real-time. With DSPy, developers can create chatbots that adapt their responses based on the conversation's context, leading to more natural and engaging interactions.

2. Recommendation Systems

Recommendation systems are crucial for platforms like Netflix and Amazon, helping users discover content they might enjoy. DSPy can help build systems that adjust recommendations based on user behavior and preferences, making them more effective.

3. AI-driven Applications

Any application that relies on natural language processing can benefit from DSPy. From summarizing articles to generating reports, DSPy provides a framework that can handle various tasks efficiently.
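
To connect these use cases back to DSPy, here is a hedged sketch of how each one might be expressed as a signature; all field names here are hypothetical, chosen only to illustrate the idea:

import dspy

# Hypothetical signatures for the three use cases above.
chat_reply = dspy.Predict("history, user_message -> assistant_reply")        # chatbot turn
recommend = dspy.Predict("user_profile, candidate_items -> ranked_items")    # recommendations
summarize = dspy.ChainOfThought("article -> summary")                        # reports and summaries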

Code Example: Getting Started with DSPy

To give you a clearer picture of how DSPy works, let's look at a simple code example. This snippet demonstrates the basic syntax and structure of a DSPy program. If you have Ollama running on your PC (check this guide), you can run the code yourself; just change the LLM named in the model variable to any LLM you have pulled locally.

To see which models you have available, start the server with ollama serve in one terminal, then open another terminal and run ollama list.

Let's jump into the code example:

# install DSPy: pip install dspy
import dspy

# Ollama is now compatible with the OpenAI APIs
#
# To get this to work you must include model_type='chat' in the dspy.OpenAI call.
# If you do not include this you will get an error.
#
# I have also found that stop='\n\n' is required to get the model to stop generating text after the answer is complete.
# At least with mistral.

ollama_model = dspy.OpenAI(api_base='http://localhost:11434/v1/', api_key='ollama', model='crewai-llama3.1:latest', stop='\n\n', model_type='chat')

# This sets the language model for DSPy.
dspy.settings.configure(lm=ollama_model)

# This is not required but it helps to understand what is happening
my_example = {
    "question": "What game was Super Mario Bros. 2 based on?",
    "answer": "Doki Doki Panic",
}

# This is the signature for the predictor. It is a simple question and answer model.
class BasicQA(dspy.Signature):
    """Answer questions about classic video games."""

    question = dspy.InputField(desc="a question about classic video games")
    answer = dspy.OutputField(desc="often between 1 and 5 words")

# Define the predictor.
generate_answer = dspy.Predict(BasicQA)

# Call the predictor on a particular input.
pred = generate_answer(question=my_example['question'])

# Print the answer...profit :)
print(pred.answer)

Understanding DSPy Code Step by Step

Step 1: Installing DSPy

Before we can use DSPy, we need to install it. We do this using a command in the terminal (or command prompt):

pip install dspy

What This Does:

  • pip is a tool that helps you install packages (like DSPy) that you can use in your Python programs.

  • install dspy tells pip to get the DSPy package from the internet.


Step 2: Importing DSPy

Next, we need to bring DSPy into our Python program so we can use it:

import dspy

What This Does:

  • import dspy means we want to use everything that DSPy offers in our code.


Step 3: Setting Up the Model

Now we need to set up the language model we want to use. This is where we connect to a special service (Ollama) that helps us generate answers:

ollama_model = dspy.OpenAI(api_base='http://localhost:11434/v1/', api_key='ollama', model='crewai-llama3.1:latest', stop='\n\n', model_type='chat')

What This Does:

  • dspy.OpenAI(...) is how we tell DSPy to talk to an OpenAI-compatible API; here that API is served locally by Ollama.

  • api_base is the address where the service is running.

  • api_key is like a password that lets us use the service.

  • model tells DSPy which specific AI model to use.

  • stop='\n\n' tells the model when to stop generating text (after it finishes answering).

  • model_type='chat' specifies that we want to use a chat-like model.
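
Before moving on, it can help to confirm the connection works. DSPy language-model clients can be called directly with a prompt and return a list of completions; this is a small sketch, assuming Ollama is already serving the model named above:

# Quick connectivity check: send a tiny prompt straight to the model.
completions = ollama_model("Reply with the single word: ready")
print(completions[0])   # should print something close to "ready" if the server is reachable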


Step 4: Configuring DSPy Settings

Now we set DSPy to use our model:

dspy.settings.configure(lm=ollama_model)

What This Does:

  • This line tells DSPy to use the ollama_model we just set up for generating answers.


Step 5: Creating an Example

We create a simple example to understand how our question and answer system will work:

my_example = {
    "question": "What game was Super Mario Bros. 2 based on?",
    "answer": "Doki Doki Panic",
}

What This Does:

  • my_example is a dictionary (like a box that holds related information) with a question and its answer.


Step 6: Defining the Question and Answer Model

Next, we define a class that describes what our question and answer system looks like:

class BasicQA(dspy.Signature):
    """Answer questions about classic video games."""

    question = dspy.InputField(desc="a question about classic video games")
    answer = dspy.OutputField(desc="often between 1 and 5 words")

What This Does:

  • class BasicQA(dspy.Signature): creates a new type of object that can handle questions and answers.

  • question is where we input our question.

  • answer is where we get the answer back.

  • The desc tells us what kind of information we should put in or expect.


Step 7: Creating the Predictor

Now we create a predictor that will help us generate answers based on our questions:

generate_answer = dspy.Predict(BasicQA)

What This Does:

  • dspy.Predict(BasicQA) creates a function that can take a question and give us an answer based on the BasicQA model we defined.


Step 8: Getting an Answer

Now we can use our predictor to get an answer to our question:

pred = generate_answer(question=my_example['question'])

What This Does:

  • We call generate_answer with our example question, and it will return an answer, which we store in pred.


Step 9: Printing the Answer

Finally, we print out the answer we got:

print(pred.answer)

What This Does:

  • This line shows the answer generated by our predictor on the screen.


Summary

In summary, this code sets up a simple question-and-answer system using DSPy and a language model. Here’s what we did:

  1. Installed DSPy: We got the package we need.
  2. Imported DSPy: We brought it into our code.
  3. Set Up the Model: We connected to the AI model.
  4. Configured DSPy: We told DSPy to use our model.
  5. Created an Example: We made a sample question and answer.
  6. Defined the Model: We explained how our question and answer system works.
  7. Created the Predictor: We made a function to generate answers.
  8. Got an Answer: We asked our question and got an answer.
  9. Printed the Answer: We showed the answer on the screen.

Now you can ask questions about classic video games and get answers using this code! To learn how to take it further, wait for the next part of the blog.
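
In the meantime, you can reuse the same predictor for any other question that fits the signature; the quality of the answer will depend on the local model you chose:

# Ask the same predictor a new (hypothetical) question.
pred = generate_answer(question="On which console did the original Metroid first appear?")
print(pred.answer)   # expect a short answer such as "NES", depending on the model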

Interesting Facts about DSPy

  • Developed by Experts: DSPy was developed by researchers at Stanford University, showcasing a commitment to improving the usability of language models in real-world applications.
  • User-Friendly Design: The framework is designed to be accessible, catering to developers with varying levels of experience in AI and machine learning.
  • Not Just About Prompts: DSPy emphasizes the need for systematic approaches that can lead to better performance and user experience, moving beyond just replacing hard-coded prompts.

Conclusion

In conclusion, DSPy represents a significant advancement in how developers can interact with language models. By embracing programming over manual prompting, DSPy opens up new possibilities for building sophisticated AI applications that are both flexible and reliable. Its modular design, support for declarative programming, and focus on enhancing reliability make it a valuable tool for developers looking to leverage the power of language models in their applications.

Whether you're creating a chatbot, a recommendation system, or any other AI-driven application, DSPy provides the framework you need to streamline your development process and improve user interactions. As the landscape of AI continues to evolve, tools like DSPy will be essential for making the most of these powerful technologies.

With DSPy, the future of programming with language models looks promising, and we can’t wait to see the innovative applications that developers will create using this groundbreaking framework. So why not give DSPy a try and see how it can transform your approach to building AI applications?

References

  1. dspy/intro.ipynb at main · stanfordnlp/dspy – GitHub
  2. An Introduction To DSPy – Cobus Greyling – Medium
  3. DSPy: The framework for programming—not prompting—foundation models
  4. Intro to DSPy: Goodbye Prompting, Hello Programming! – YouTube
  5. An Exploratory Tour of DSPy: A Framework for Programing … – Medium
  6. A gentle introduction to DSPy – LearnByBuilding.AI
  7. What Is DSPy? How It Works, Use Cases, and Resources – DataCamp
  8. Who is using DSPy? : r/LocalLLaMA – Reddit
  9. Intro to DSPy: Goodbye Prompting, Hello Programming!
  10. Goodbye Manual Prompting, Hello Programming With DSPy

Expand your professional network—let’s connect on LinkedIn today!

Enhance your AI knowledge with AI&U—visit our website here.


Declaration: The whole blog itself is written using Ollama, CrewAI, and DSPy 👀

