-
ScienciaLAB
- Planet earth
- https://www.sciencialab.com
- https://orcid.org/0000-0002-6114-6164
- in/lfoppiano
- @sciencialab.com
- @whitenoise
-
jwarc Public
Forked from iipc/jwarc
Java library for reading and writing WARC files with a typed API
Java Apache License 2.0 Updated Dec 29, 2025 -
streamlit-pdf-viewer Public
Streamlit PDF viewer
-
nutch Public
Forked from apache/nutch
Apache Nutch is an extensible and scalable web crawler
Java Apache License 2.0 Updated Dec 11, 2025 -
solution_documentation Public
Forked from SoFairOA/solution_documentation
Creative Commons Attribution 4.0 International Updated Dec 10, 2025 -
software-mentions Public
Forked from softcite/software-mentions
Finding citations to research software from within the academic literature
-
pdf-tei-editor Public
Forked from mpilhlt/pdf-tei-editor
A viewer/editor web app to compare the PDF source and the TEI extraction/annotation result
JavaScript Creative Commons Zero v1.0 Universal Updated Nov 12, 2025 -
structure-vision Public
Viewer for the structure extracted by Grobid on PDF documents
-
aws-grobid Public
Forked from evamaxfield/aws-grobid
Deploy GROBID to AWS EC2
Python Mozilla Public License 2.0 Updated Oct 14, 2025 -
software_mentions_client Public
Forked from softcite/software_mentions_client
A Python client for the Softcite software mention recognizer server
Python Apache License 2.0 Updated Oct 2, 2025 -
document-qa Public
Scientific Document Insight Q/A
-
pdfalto Public
Forked from kermitt2/pdfalto
PDF to XML ALTO file converter
C GNU General Public License v2.0 Updated Aug 6, 2025 -
datastet Public
Forked from kermitt2/datastet
Finding mentions and citations to named and implicit research datasets from within the academic literature
JavaScript Apache License 2.0 Updated Jul 1, 2025 -
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
-
hedgehog Public
A collection of applications and utilities of text extraction applied to several domains (history, geography, ...)
-
grobid-quantities Public
GROBID extension for identifying and normalizing physical quantities.
-
-
Python client for Grobid Quantities
-
grobid Public
Forked from kermitt2/grobid
A machine learning software for extracting information from scholarly documents
-
pdf-extraction-benchmarks Public
Forked from py-pdf/benchmarks
Benchmarking PDF libraries
Python BSD 3-Clause "New" or "Revised" License Updated May 26, 2025 -
grobid-superconductors Public
Grobid module for superconductor material and properties extraction
-
Pub2TEI Public
Forked from kermitt2/Pub2TEI
Service for converting and enhancing heterogeneous publisher XML formats into TEI
XSLT Apache License 2.0 Updated May 17, 2025 -
BlingFire Public
Forked from microsoft/BlingFire
A lightning fast Finite State machine and REgular expression manipulation library.
-
awesome-materials-informatics Public
Forked from tilde-lab/awesome-materials-informatics
Curated list of known efforts in materials informatics, i.e. in modern materials science
Updated May 7, 2025 -
material-parsers Public
Material parsers and other tools and scripts, initially developed for Grobid Superconductor
-
Wapiti Public
Forked from kermitt2/Wapiti
A simple and fast discriminative sequence labeling toolkit
C Other Updated Feb 18, 2025 -
grobid_client_python Public
Forked from kermitt2/grobid-client-python
Python client for GROBID Web services
Python Apache License 2.0 Updated Jan 1, 2025





