Application URL: http://threat-feeds.us-east-2.elasticbeanstalk.com/
Submission for the SANS AI Cybersecurity Hackathon.
Threat Feeds is a web application that allows users to browse, search and ask questions about the latest threat reports published across the security industry.
- Filter threat reports by title, source and publish date
- Full text search across threat report contents
- Ask an AI assistant questions about the contents of the threat reports
- Automatic IOC extraction (hashes, IP addresses, domain names, CVEs, MITRE ATT&CK techniques, YARA rules) for each threat report
- Automatic context-based false positive IOC detection for each threat report
- VirusTotal, NIST and MITRE enrichments for IOCs
- Related reports or "more like this" feature for each threat report
- APIs for listing and searching reports, retrieving a particular report and the Q&A feature.
Jupyter notebooks were used for quick iteration, but they are structured so that each one can be run as-is after converting it into a Python script with the command:
jupyter nbconvert --to script <script>.ipynb
- threat_report_parsing.ipynb: Crawl the latest pages from the RSS feeds, store the raw and parsed page data, extract IOCs, persist the metadata and index the documents into a Whoosh search index.
- false_positive_iocs.ipynb: Run an LLM over each IOC extracted from a report along with its context, and determine if the IOC is valid and relevant.
- enrich_reports.ipynb: For each hash found in a report, link the VirusTotal URL if it exists. For each MITRE ATT&CK technique found in a report, link the corresponding page on MITRE.
- similarity.ipynb: Compute related reports for each report by chunking them up, computing embeddings and storing them in ChromaDB. Also upload the documents to a Pinecone Assistant for the Q&A feature.
- migrate.ipynb: Copy all the necessary data to an AWS PostgreSQL instance, the "production" database that the app interacts with.
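As a rough illustration of the IOC extraction step in the first notebook, here is a minimal regex-based sketch. The patterns, the `IOC_PATTERNS` table and the `extract_iocs` function are assumptions made for this example, not the notebook's actual code. Domain names, YARA rules and defanged indicators (such as `1.2.3[.]4`) are not handled here.

```python
import re

# Illustrative patterns only; the notebook's real extraction logic may differ.
IOC_PATTERNS = {
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "sha1": re.compile(r"\b[a-fA-F0-9]{40}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "cve": re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE),
    "mitre_technique": re.compile(r"\bT\d{4}(?:\.\d{3})?\b"),
}


def extract_iocs(text):
    """Return a dict mapping each IOC type to its sorted, de-duplicated matches."""
    found = {}
    for kind, pattern in IOC_PATTERNS.items():
        matches = sorted({m.group(0) for m in pattern.finditer(text)})
        if matches:
            found[kind] = matches
    return found


sample = (
    "Dropper 44d88612fea8a8f36de82e1278abb02f beacons to 185.220.101.4, "
    "exploiting CVE-2021-44228 (T1059.001)."
)
print(extract_iocs(sample))
```

Pattern matching like this over-matches by design (version strings can look like IP addresses, for example), which is why a context-based false positive pass such as the one in false_positive_iocs.ipynb is useful afterwards.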
Since this is a hackathon submission, the process for updating the latest reports is pretty hacky. Run the scripts in the following order:
- threat_report_parsing.ipynb
- false_positive_iocs.ipynb
- enrich_reports.ipynb
- similarity.ipynb
- migrate.ipynb
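The manual run order above could be wrapped in a small driver script. The following is a sketch under a few assumptions: it is run from the directory containing the notebooks, the notebooks use a Python kernel (so `nbconvert` writes a `.py` file next to each one), and the `NOTEBOOKS`, `build_commands` and `run_pipeline` names are hypothetical helpers introduced for this example.

```python
import subprocess
import sys
from pathlib import Path

# Run order taken from the list above; adjust the names to match the files on disk.
NOTEBOOKS = [
    "threat_report_parsing.ipynb",
    "false_positive_iocs.ipynb",
    "enrich_reports.ipynb",
    "similarity.ipynb",
    "migrate.ipynb",
]


def build_commands(notebook):
    """Return the two commands for one notebook: convert it to a script, then run it."""
    script = Path(notebook).with_suffix(".py")
    return [
        ["jupyter", "nbconvert", "--to", "script", notebook],
        [sys.executable, str(script)],
    ]


def run_pipeline(notebooks=NOTEBOOKS):
    """Convert and run each notebook in order, stopping at the first failure."""
    for notebook in notebooks:
        for command in build_commands(notebook):
            # check=True raises CalledProcessError if a step exits non-zero,
            # so later stages never run on incomplete data.
            subprocess.run(command, check=True)
```

Calling `run_pipeline()` would then refresh the data end to end. Stopping on the first error matters here because each stage depends on the output of the previous one.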
From the eb-flask directory, run the following:
zip ../eb-flask.zip -r * .[^.]* -x __pycache__/\* -x flask-app-venv/\*
Upload the compressed file to AWS Elastic Beanstalk.
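On systems without the `zip` command, a similar bundle could be built with Python's standard library. This is a sketch, not part of the project: `build_bundle` and `EXCLUDED_DIRS` are hypothetical names. Unlike the `zip` command above, it skips the excluded directories at any depth rather than only at the top level, and it stores files only (no directory entries).

```python
import os
import zipfile
from pathlib import Path

# Directories left out of the bundle, mirroring the -x options of the zip command.
EXCLUDED_DIRS = {"__pycache__", "flask-app-venv"}


def build_bundle(source_dir, archive_path):
    """Zip source_dir (hidden files included) into archive_path, skipping EXCLUDED_DIRS."""
    source = Path(source_dir).resolve()
    archive = Path(archive_path).resolve()
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as bundle:
        for root, dirs, files in os.walk(source):
            # Prune in place so os.walk never descends into excluded directories.
            dirs[:] = [d for d in dirs if d not in EXCLUDED_DIRS]
            for name in files:
                path = Path(root) / name
                if path.resolve() == archive:
                    continue  # never add the archive to itself
                # Store paths relative to the source root, as Elastic Beanstalk expects.
                bundle.write(path, path.relative_to(source).as_posix())
    return archive
```

Run from the eb-flask directory, `build_bundle(".", "../eb-flask.zip")` would produce an archive comparable to the one created by the command above.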
- User generated content
- Votes and comments
- Upload custom, private threat reports
- Share threat reports privately
- Chatbot for longer conversations about the threat report contents
- Integrations - OpenCTI, SOAR enrichment plugins etc.
