Data transformation framework for AI. Ultra performant, with incremental processing. 🌟 Star if you like it!
Updated Nov 30, 2025 - Rust
Big Data and Machine Intelligence Course in Autumn 2019.
Patient intake form extraction using an LLM
🧠 Solana DEX Swap Data Indexer: a Substream-powered swap indexer for Solana that supports Pump.fun, PumpSwap, BonkFun, Meteora, Raydium, Orca & more. ⚡📊🔥 Designed for real-time trade analytics, MEV research, and on-chain data pipelines. 📡
🌲 Improved Interval B+ tree implementation, in TS 🌲
An application that recommends the scientific papers most similar to a given input paragraph, built with the llama and weaviate libraries.
A zero-dependency library of classes that make filtering, sorting and observing changes to arrays easier and more efficient.
Datafast Runtime is a high-performance subgraph processing runtime, written from scratch and designed for fast, storage-efficient subgraph handling.
BORDS is an open-access reaction search engine that leverages Google's Open Reaction Database to provide ultra-fast, comprehensive access to millions of chemical reactions. Built with a modern cloud stack, it streamlines reaction data extraction, transformation, and indexing for researchers in chemistry and related fields.
System for Managing the data generated by the SEAGrid Science Gateway
Designed to store and retrieve high-dimensional data, such as embeddings, efficiently. It enables fast similarity searches by leveraging techniques.
Examples of RAG (Retrieval-Augmented Generation) with Ethora, LangChain, and OpenAI. Build knowledge-based AI assistants fast. Powered by Ethora Chat Component.
Time series analysis covering trend, seasonality, and periodicity decomposition, plus forecasting with Facebook Prophet. The analysis makes extensive use of data indexing tools and of the Pandas and Datetime libraries.
Python implementation of a TF-IDF/cosine based search engine
RESTful search API built with Flask and Elasticsearch. Features full-text search, data indexing, and query capabilities for a dataset of Shakespeare plays, with a scalable architecture and a production-ready implementation.
A comprehensive guide to building a modern data warehouse with SQL Server using the medallion architecture, including ETL processes, data modeling, and analytics.