
Shakespeare and the Bible: An AI Investigation

Summary

Could the greatest playwright of all time have secretly shaped one of the most influential religious texts in history? Some believe William Shakespeare left his mark on the King James Bible hidden in plain sight. With the power of AI, we’ll investigate whether there’s any truth to this conspiracy.

You can read about the conspiracy here:

PostgreSQL for AI: Storing and Searching Embeddings with pgvector

Summary

Vector databases are essential for modern AI applications like semantic search, recommendation systems, and natural language processing. They allow us to store and query high-dimensional vectors efficiently. With the pgvector extension, PostgreSQL becomes a powerful vector database, enabling you to combine traditional relational data with vector-based operations.

In this post, we will walk through the full process:

  • Installing PostgreSQL and pgvector
  • Setting up a vector-enabled database
  • Generating embeddings using Ollama
  • Running similarity queries with Python

By the end, you’ll be able to store, query, and compare high-dimensional vectors in PostgreSQL, opening up new possibilities for AI-powered applications.
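As a taste of where the post ends up, here is a minimal sketch of the store-and-search loop using psycopg2; the connection string, table name, and toy 4-dimensional vectors are assumptions for illustration:

```python
import psycopg2

# Connect to a local database; credentials here are illustrative
conn = psycopg2.connect("dbname=vectordb user=postgres")
cur = conn.cursor()

# Enable the extension and create a table with a 4-dimensional vector
# column (real embeddings are typically hundreds of dimensions)
cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
cur.execute(
    "CREATE TABLE IF NOT EXISTS items "
    "(id serial PRIMARY KEY, content text, embedding vector(4))"
)

# pgvector accepts vectors as '[x,y,z]' text literals
vec = "[0.1,0.2,0.3,0.4]"
cur.execute(
    "INSERT INTO items (content, embedding) VALUES (%s, %s::vector)",
    ("example document", vec),
)

# Nearest-neighbour search with pgvector's L2 distance operator <->
cur.execute(
    "SELECT content FROM items ORDER BY embedding <-> %s::vector LIMIT 5",
    (vec,),
)
print(cur.fetchall())
conn.commit()
```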

Beyond Text Generation: Coding Ollama Function Calls and Tools

Summary

Function calling allows Large Language Models (LLMs) to interact with APIs, databases, and other tools, making them more than just text generators.

Integrating LLMs with functions enables you to harness their powerful text processing capabilities, seamlessly enhancing the technological solutions you develop.

This post will explain how you can call local Python functions and tools from Ollama.


Introduction to Ollama Function Calling

Ollama allows you to run state-of-the-art LLMs like Qwen, Llama, and others locally without relying on cloud APIs. Its function-calling feature enables models to execute external Python functions, making it ideal for applications like chatbots, automation tools, and data-driven systems.
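As a preview of the approach, here is a minimal sketch using the official ollama Python package; the model name and the get_weather tool are illustrative assumptions, not the post's own example:

```python
import ollama

def get_weather(city: str) -> str:
    """Stand-in weather lookup exposed to the model as a tool."""
    return f"It is sunny in {city}."

response = ollama.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)

# If the model decided to call a tool, dispatch to the local function
for call in response["message"].get("tool_calls") or []:
    if call["function"]["name"] == "get_weather":
        print(get_weather(**call["function"]["arguments"]))
```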

Building AI-Powered Applications with Haystack and Ollama

Summary

In this post, I will demonstrate how to set up and use Haystack with Ollama.

Haystack is a framework that helps you build applications powered by LLMs.

  • It offers extensive LLM-related functionality.
  • It is open source under the Apache license.
  • It is actively developed, with numerous contributors.
  • It is widely used in production by various clients.

These are some of the key criteria to check before adopting a library in a project.
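To give a flavour of what the post covers, here is a minimal sketch of a Haystack 2.x pipeline driven by Ollama, assuming the ollama-haystack integration package is installed; the model name, URL, and prompt template are illustrative:

```python
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack_integrations.components.generators.ollama import OllamaGenerator

template = "Answer briefly: {{ question }}"

pipeline = Pipeline()
pipeline.add_component("prompt", PromptBuilder(template=template))
pipeline.add_component("llm", OllamaGenerator(model="llama3", url="http://localhost:11434"))
pipeline.connect("prompt", "llm")  # feed the rendered prompt into the generator

result = pipeline.run({"prompt": {"question": "What is Haystack?"}})
print(result["llm"]["replies"][0])
```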

Efficient Similarity Search with FAISS and SQLite in Python

Summary

This is another component of SmartAnswer, an enhanced LLM interface.

In this blog post, we introduce a wrapper class, FaissDB, which integrates FAISS with SQLite or any database to manage document embeddings and enable efficient similarity search. This approach combines FAISS’s vector search capabilities with the storage and querying power of a database, making it ideal for applications such as Retrieval-Augmented Generation (RAG) and recommendation systems.

It builds on this tool: PaperSearch.
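The core idea can be sketched in a few lines: FAISS owns the vectors, SQLite owns the documents, and row ids tie them together. The schema and dimension below are illustrative assumptions, not the actual FaissDB implementation:

```python
import sqlite3
import faiss
import numpy as np

dim = 384
index = faiss.IndexFlatL2(dim)  # exact L2 nearest-neighbour search
conn = sqlite3.connect("docs.db")
conn.execute("CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, text TEXT)")

docs = ["first document", "second document"]
vectors = np.random.rand(len(docs), dim).astype("float32")  # placeholder embeddings

for i, text in enumerate(docs):
    conn.execute("INSERT INTO docs (id, text) VALUES (?, ?)", (i, text))
index.add(vectors)  # FAISS assigns ids in insertion order, matching our rows

# Search: FAISS returns ids, SQLite resolves them back to documents
query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, 2)
for doc_id in ids[0]:
    row = conn.execute("SELECT text FROM docs WHERE id = ?", (int(doc_id),)).fetchone()
    print(doc_id, row[0])
```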

Automating Paper Retrieval and Processing with PaperSearch

Summary

This is part one in a series of blog posts working towards SmartAnswer, a comprehensive improvement to how Large Language Models (LLMs) answer questions.

This tool will be the source of data for SmartAnswer, allowing it to find and retrieve better data when generating answers.

I want this tool to be included in that solution, but I don’t want all the code from this tool distracting from the SmartAnswer solution. Hence this post.
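As a hint of what the tool automates, here is a minimal sketch of paper retrieval using the arxiv package from PyPI; the query and result count are assumptions, and the post's own implementation may differ:

```python
import arxiv

client = arxiv.Client()
search = arxiv.Search(query="large language models", max_results=3)

for paper in client.results(search):
    print(paper.title)
    paper.download_pdf(dirpath=".")  # saves the PDF next to the script
```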

DeepResearch Part 2: Building a RAG Tool for arXiv PDFs

Summary

In this post, we’ll build a Retrieval Augmented Generation (RAG) tool to process the PDF files downloaded from arXiv in the previous post DeepResearch Part 1. This RAG tool will be capable of loading, processing, and semantically searching the document content. It’s a versatile tool applicable to various text sources, including web pages.

Building the RAG Tool

Following up on our arXiv downloader, we now need a tool to process the downloaded PDFs. This post details the creation of such a tool.
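As a preview, here is a minimal sketch of the ingestion step: extract text with pypdf, chunk it naively, and embed each chunk with Ollama. The packages, filenames, and the nomic-embed-text model are assumptions for illustration:

```python
import ollama
from pypdf import PdfReader

reader = PdfReader("paper.pdf")
text = " ".join(page.extract_text() or "" for page in reader.pages)

# Naive fixed-size chunking; the post's tool would do something smarter
chunk_size = 1000
chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

embeddings = [
    ollama.embeddings(model="nomic-embed-text", prompt=chunk)["embedding"]
    for chunk in chunks
]
print(f"Embedded {len(embeddings)} chunks from {len(text)} characters")
```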

Creating AI-Powered Paper Videos: From Research to YouTube

Summary

This post demonstrates how to automatically transform a scientific paper (or any text/audio content) into a YouTube video using AI. We’ll leverage several powerful tools, including large language models (LLMs), Whisper for transcription, Stable Diffusion for image generation, and FFmpeg for video assembly. This process can streamline content creation and make research more accessible.

Overview

Our pipeline involves these steps:

  1. Audio Generation (Optional): If starting from a text document, we’ll use a text-to-speech service (like NotebookLM, or others) to create an audio narration.
  2. Transcription: We’ll use Whisper to transcribe the audio into text, including timestamps for each segment.
  3. Database Storage: The transcribed text, timestamps, and metadata will be stored in an SQLite database for easy management.
  4. Text Chunking: We’ll divide the transcript into logical chunks (e.g., by sentence or time duration).
  5. Concept Summarization: An LLM will summarize the core concept of each chunk.
  6. Image Prompt Generation: Another LLM will create a detailed image prompt based on the summary.
  7. Image Generation: Stable Diffusion (or a similar tool) will generate images from the prompts.
  8. Video Assembly: FFmpeg will combine the images and audio into a final video (see the sketch after this list).
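To make step 8 concrete before diving in, here is a minimal sketch of the assembly step driven from Python; the filenames, frame rate, and codecs are illustrative assumptions:

```python
import subprocess

subprocess.run([
    "ffmpeg", "-y",
    "-framerate", "1/5",            # show each image for 5 seconds
    "-i", "frames/frame_%03d.png",  # frame_001.png, frame_002.png, ...
    "-i", "audio.wav",
    "-c:v", "libx264",
    "-pix_fmt", "yuv420p",          # broad player compatibility
    "-c:a", "aac",
    "-shortest",                    # stop when the shorter stream ends
    "video.mp4",
], check=True)
```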

Prerequisites

  • Hugging Face CLI: Install it to download the Whisper model: pip install huggingface_hub
  • Whisper: Install the whisper-timestamped package, or your preferred Whisper implementation.
  • Ollama: You’ll need a running instance of Ollama to access the LLMs.
  • Stable Diffusion WebUI (or similar): For image generation.
  • FFmpeg: For video and audio processing. Ensure it’s in your system’s PATH.
  • Python Libraries: Install necessary Python packages: pip install pydub requests Pillow (and any others as needed; sqlite3 ships with Python’s standard library).

1️⃣ Audio Generation (Optional)

If you’re starting with a text document, you’ll need to convert it to audio. Several cloud services and libraries can do this. For this example, we’ll assume you have an audio file (audio.wav).
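If you prefer to generate the narration locally instead, here is a minimal sketch using the pyttsx3 package; this is an assumption for illustration, not the service used in the post:

```python
import pyttsx3

engine = pyttsx3.init()
with open("script.txt") as f:
    engine.save_to_file(f.read(), "audio.wav")
engine.runAndWait()  # blocks until the file is written
```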

Ollama: The local LLM solution

Using Ollama


Introduction

Ollama is the best platform for running, managing, and interacting with Large Language Models (LLMs) locally. For Python programmers, Ollama offers seamless integration and robust features for querying, manipulating, and deploying LLMs. In this post, I will explore how Python developers can leverage Ollama for powerful and efficient AI-based workflows.
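A minimal sketch of that integration, using the official ollama Python package (the model name is illustrative):

```python
import ollama

# Send a single chat turn to a locally running Ollama instance
response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Explain embeddings in one sentence."}],
)
print(response["message"]["content"])
```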


1️⃣ What is Ollama?

Ollama is a tool designed to enable local hosting and interaction with LLMs. Unlike cloud-based APIs, Ollama prioritizes privacy and speed by running models directly on your machine. Key benefits include: