- https://PyPiSan.com
- @funsanjeev
- in/funsanjeev
Stars
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
The Web framework for perfectionists with deadlines.
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.
Get your documents ready for gen AI
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
🤗 smolagents: a barebones library for agents that think in code.
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
The official Python SDK for Model Context Protocol servers and clients
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Low-code framework for building custom LLMs, neural networks, and other AI models
Large Language Model Text Generation Inference
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
🐍 The official Python client library for Google's discovery based APIs.
ASCII generator (image to text, image to image, video to video)
Supercharge Your LLM with the Fastest KV Cache Layer
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev…
yolov5 + csl_label. (Oriented Object Detection) (Rotation Detection) (Rotated BBox) Rotated object detection based on yolov5
A web interface to extract tabular data from PDFs
Implementation of "EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer" (ICCV 2025)
OctoTools: An agentic framework with extensible tools for complex reasoning



