- We use arxiv sanity preserver to fetch and download papers, so make sure to run its commands first. By default, clone arxiv sanity preserver into the same directory; otherwise, remember to change `arxiv_sanity_pdf_path`.
- Install detex. We use this open-source package to strip the LaTeX formatting.
- After downloading the PDF files, I recommend changing `user_agent` in the `arxiv.py` file to your own, because arXiv sometimes blocks the existing one. Then run the `python arxiv.py` script to process the files.
- All processed files are stored in the `data` folder.
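
For the `user_agent` advice above, here is a minimal sketch of how a custom User-Agent can be attached to a request using only the Python standard library. It is not taken from `arxiv.py`; the names `USER_AGENT`, `build_request`, and `fetch` are hypothetical, and the agent string is a placeholder you would replace with your own.

```python
import urllib.request

# Hypothetical placeholder: replace with a string that identifies you.
USER_AGENT = "my-arxiv-client/0.1 (mailto:you@example.com)"


def build_request(url, user_agent=USER_AGENT):
    """Return a urllib Request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=USER_AGENT, timeout=30):
    """Download the raw bytes at `url`, sending the custom User-Agent."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping the header construction in its own function makes it possible to check the agent string without touching the network.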
BruceChaun/ArxivInference