BlackWeb is a project that collects and unifies public blocklists of domains (porn, downloads, drugs, malware, spyware, trackers, bots, social networks, warez, weapons, etc.) to make them compatible with Squid-Cache.
| ACL | Blocked Domains | File Size |
|---|---|---|
| blackweb.txt | 5318401 | 138.3 MB |
git clone --depth=1 https://github.com/maravento/blackweb.git

blackweb.txt is already updated and optimized for Squid-Cache. Download it, extract it to the path of your preference, and activate the Squid-Cache rule.

wget -q -c -N https://raw.githubusercontent.com/maravento/blackweb/master/blackweb.tar.gz && cat blackweb.tar.gz* | tar xzf -

#!/usr/bin/env bash
base_url="https://raw.githubusercontent.com/maravento/blackweb/master/blackweb.tar.gz."
for num in {000..999}; do
file="${base_url}${num}"
echo "Check: $file"
if wget --spider "$file" 2>/dev/null; then
wget -q -c --timestamping --no-check-certificate --retry-connrefused --timeout=10 --tries=4 --show-progress "$file"
else
break
fi
done
cat blackweb.tar.gz* | tar xzf -

wget -q -c -N https://raw.githubusercontent.com/maravento/blackweb/master/checksum.md5
md5sum blackweb.txt | awk '{print $1}' && cat checksum.md5 | awk '{print $1}'

BlackWeb Rule for Squid-Cache

Edit /etc/squid/squid.conf and add the following lines:
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
# Block Rule for Blackweb
acl blackweb dstdomain "/path_to/blackweb.txt"
http_access deny blackweb

BlackWeb contains millions of domains, so the following is recommended:

Use allowdomains.txt to exclude domains or false positives (e.g., accounts.youtube.com: since February 2014, Google has used this subdomain to authenticate its services).

acl allowdomains dstdomain "/path_to/allowdomains.txt"
http_access allow allowdomains

Use blockdomains.txt to add domains not included in blackweb.txt (e.g., .youtube.com, .googlevideo.com, .ytimg.com, etc.).

acl blockdomains dstdomain "/path_to/blockdomains.txt"
http_access deny blockdomains

Use blocktlds.txt to block gTLDs, sTLDs, ccTLDs, etc.

acl blocktlds dstdomain "/path_to/blocktlds.txt"
http_access deny blocktlds

Input:
.bardomain.xxx
.subdomain.bardomain.xxx
.bardomain.ru
.bardomain.adult
.foodomain.com
.foodomain.porn

Output:
.foodomain.com

Use this rule to block Punycode (RFC 3492) IDNs, i.e. non-ASCII TLDs or domains, to prevent IDN homograph attacks. For more information, see welivesecurity: Homograph attacks.

acl punycode dstdom_regex -i \.xn--.*
http_access deny punycode

Input:
.bücher.com
.mañana.com
.mydomain.org
.net
.xn--fiqz9s
.xn--p1ai

Output:
.mydomain.org
.net

Use this rule to block words (optional; it can generate false positives).

acl blockwords url_regex -i "/etc/acl/blockwords.txt"
http_access deny blockwords

Input:
.bittorrent.com
https://www.google.com/search?q=torrent
https://www.google.com/search?q=mydomain
https://www.google.com/search?q=porn
.mydomain.com

Output:
https://www.google.com/search?q=mydomain
.mydomain.com

# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
# Allow Rule for Domains
acl allowdomains dstdomain "/path_to/allowdomains.txt"
http_access allow allowdomains
# Block Rule for Punycode
acl punycode dstdom_regex -i \.xn--.*
http_access deny punycode
# Block Rule for gTLD, sTLD, ccTLD
acl blocktlds dstdomain "/path_to/blocktlds.txt"
http_access deny blocktlds
# Block Rule for Words (Optional)
acl blockwords url_regex -i "/etc/acl/blockwords.txt"
http_access deny blockwords
# Block Rule for Domains
acl blockdomains dstdomain "/path_to/blockdomains.txt"
http_access deny blockdomains
# Block Rule for Blackweb
acl blackweb dstdomain "/path_to/blackweb.txt"
http_access deny blackweb

BlackWeb is designed exclusively for Squid-Cache. Because of the large number of blocked domains, it is not recommended to use it in other environments (DNSMasq, Pi-Hole, etc.) or to add it to the Windows hosts file, as it could slow them down or crash them. Use it at your own risk.

For more information, see Issue 10.
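Squid evaluates http_access lines from top to bottom and applies the first one that matches, which is why the allow rule for allowdomains is placed before the deny rules. As a rough illustration only (this is not how Squid is implemented), the following shell sketch mimics that first-match behavior for two dstdomain lists. The function names `matches_acl` and `decide` and the file arguments are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: emulate first-match evaluation for two dstdomain lists.
# matches_acl and decide are hypothetical helpers, not part of Squid or BlackWeb.

# Succeeds if HOST matches an entry in ACL_FILE.
# ".example.com" matches example.com and all of its subdomains;
# "example.com" (no leading dot) matches only that exact host.
matches_acl() {
  local host=$1 acl_file=$2 entry
  while IFS= read -r entry; do
    case $entry in
      '' | \#*) continue ;;                        # skip blank lines and comments
      .*)
        [ "$host" = "${entry#.}" ] && return 0     # the domain itself
        case $host in *"$entry") return 0 ;; esac  # any subdomain
        ;;
      *) [ "$host" = "$entry" ] && return 0 ;;
    esac
  done < "$acl_file"
  return 1
}

# Mirror the rule order: the allow list is checked first, then the deny list.
decide() {
  local host=$1 allow_file=$2 deny_file=$3
  if matches_acl "$host" "$allow_file"; then
    echo "ALLOW"
  elif matches_acl "$host" "$deny_file"; then
    echo "DENY"
  else
    echo "NO MATCH"   # later http_access rules would decide
  fi
}
```

With .accounts.youtube.com in the allow file and .youtube.com in the deny file, `decide accounts.youtube.com allow.txt deny.txt` prints ALLOW, while `decide www.youtube.com allow.txt deny.txt` prints DENY.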
This section only explains how the update and optimization process works. Users do not need to run it. The process can take time and consume significant hardware and bandwidth resources, so it is recommended to use test equipment.

The update process of blackweb.txt consists of several steps, executed in sequence by the script bwupdate.sh. The script will request privileges when required.

wget -q -N https://raw.githubusercontent.com/maravento/blackweb/master/bwupdate/bwupdate.sh && chmod +x bwupdate.sh && ./bwupdate.sh

The update requires Python 3.x and Bash 5.x.
pkgs='wget git curl libnotify-bin idn2 perl tar rar unrar unzip zip python-is-python3'
if ! dpkg -s $pkgs >/dev/null 2>&1; then
apt-get install $pkgs
fi

Make sure Squid is installed correctly. If you have any problems, run the following script (sudo ./squid_install.sh):
#!/bin/bash
# kill old version
while pgrep squid > /dev/null; do
echo "Waiting for Squid to stop..."
killall -s SIGTERM squid &>/dev/null
sleep 5
done
# squid remove (if exist)
apt purge -y 'squid*' &>/dev/null
rm -rf /var/spool/squid* /var/log/squid* /etc/squid* /dev/shm/* &>/dev/null
# squid install (you can use 'squid-openssl' or 'squid')
apt install -y squid-openssl squid-langpack squid-common squidclient squid-purge
# create log
# mkdir -p and touch are safe to run when the directory and files already exist
mkdir -p /var/log/squid
touch /var/log/squid/{access,cache,store,deny}.log
# permissions
chown -R proxy:proxy /var/log/squid
# enable service
systemctl enable squid.service
systemctl start squid.service
echo "Done"

Capture domains from the downloaded public blocklists (see SOURCES) and unify them into a single file.
Remove overlapping domains ('.sub.example.com' is a subdomain of '.example.com'), normalize entries to the Squid-Cache format, and exclude false positives (google, hotmail, yahoo, etc.) with an allowlist (debugwl.txt).
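The overlap-removal part of this step can be sketched as follows. This is a minimal illustration, not the code used by bwupdate.sh. It assumes the input is already normalized to one domain per line with a leading dot and contains no blank lines or bare TLDs; the function names `reverse_lines` and `dedupe_subdomains` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: drop every domain that is a subdomain of another listed domain.
# Assumes one domain per line, each with a leading dot (e.g. ".example.com").

# Reverse the characters of each line so that a parent domain becomes
# a string prefix of its subdomains (".example.com" -> "moc.elpmaxe.").
reverse_lines() {
  awk '{ out = ""; for (i = length($0); i > 0; i--) out = out substr($0, i, 1); print out }'
}

# After a byte-wise sort, each parent sits directly before its subdomains,
# so a line is kept only if it does not start with the last kept line.
dedupe_subdomains() {
  reverse_lines |
    LC_ALL=C sort -u |
    awk 'NR == 1 || index($0, prev) != 1 { print; prev = $0 }' |
    reverse_lines |
    LC_ALL=C sort
}
```

For example, piping .sub.domain.com, .domain.com, .foo.bar.sub.domain.com and .domain.com.co through `dedupe_subdomains` leaves .domain.com and .domain.com.co. The leading dot matters: it keeps .xdomain.com from being treated as a subdomain of .domain.com.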
Input:
com
.com
.domain.com
domain.com
0.0.0.0 domain.com
127.0.0.1 domain.com
::1 domain.com
domain.com.co
foo.bar.subdomain.domain.com
.subdomain.domain.com.co
www.domain.com
www.foo.bar.subdomain.domain.com
domain.co.uk
xxx.foo.bar.subdomain.domain.co.uk

Output:
.domain.com
.domain.com.co
.domain.co.uk

Remove domains with invalid TLDs, using a list of public and private suffixes (ccTLD, ccSLD, sTLD, uTLD, gSLD, gTLD, eTLD, etc., up to 4th-level domains, 4LDs).
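One way to picture this filter, as a sketch: keep only the domains whose final label appears in a list of known TLDs. The real process relies on a much larger suffix list that covers multi-level suffixes. Here the suffix file (one TLD per line, without a dot, and non-empty) and the function name `filter_valid_tlds` are assumptions, and only the last label is checked.

```bash
#!/usr/bin/env bash
# Sketch only: keep domains whose last label is listed in a TLD file.
# filter_valid_tlds is a hypothetical helper; the TLD file must not be empty.

# The first awk rule loads the TLD file into a lookup table; the second rule
# prints each line from standard input whose last dot-separated field is known.
filter_valid_tlds() {
  local tld_file=$1
  awk -F. '
    NR == FNR { valid[tolower($0)] = 1; next }
    (tolower($NF) in valid)
  ' "$tld_file" -
}
```

With a TLD file containing com and ca, the input .domain.exe, .domain.com, .domain.edu.ca is reduced to .domain.com and .domain.edu.ca.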
Input:
.domain.exe
.domain.com
.domain.edu.ca

Output:
.domain.com
.domain.edu.ca

Remove hostnames longer than 63 characters (RFC 1035) and those containing characters not allowed by IDN, and convert domains with international (non-ASCII) characters, which are used in homograph attacks, to Punycode/IDNA format.
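The length check can be sketched as below. This assumes the 63-character limit is applied to each dot-separated label, which is how RFC 1035 defines it; whether bwupdate.sh checks labels or whole lines is not stated here. The function name `drop_long_labels` is hypothetical, and the Punycode conversion is not shown.

```bash
#!/usr/bin/env bash
# Sketch only: drop any domain that has a label longer than 63 characters.
# drop_long_labels is a hypothetical helper; input is assumed to be ASCII.
drop_long_labels() {
  awk -F. '{
    for (i = 1; i <= NF; i++)
      if (length($i) > 63) next   # skip the whole line
    print
  }'
}
```

It reads domains on standard input, for example `drop_long_labels < domains.txt`, where domains.txt is a placeholder file name.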
Input:
.президент.рф
.mañana.com
.bücher.com
.café.fr
.köln-düsseldorfer-rhein-main.de
.mūsųlaikas.lt
.sendesık.com

Output:
.xn--d1abbgf6aiiy.xn--p1ai
.xn--maana-pta.com
.xn--bcher-kva.com
.xn--caf-dma.fr
.xn--kln-dsseldorfer-rhein-main-cvc6o.de
.xn--mslaikas-qzb5f.lt
.xn--sendesk-wfb.com

Most of the SOURCES contain millions of invalid and nonexistent domains, so each domain is double-checked via DNS (in 2 steps) and the invalid and nonexistent ones are excluded from BlackWeb. This process may take time. By default, domains are processed in parallel at roughly 6k to 12k per minute, depending on hardware and bandwidth.
HIT google.com
google.com has address 142.251.35.238
google.com has IPv6 address 2607:f8b0:4008:80b::200e
google.com mail is handled by 10 smtp.google.com.
FAULT testfaultdomain.com
Host testfaultdomain.com not found: 3(NXDOMAIN)

For more information, check internet live stats.
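The HIT and FAULT labels shown above can be reproduced with a small wrapper around the host command, which exits with a non-zero status when a name cannot be resolved. This is a sketch under assumptions: `check_domain` is a hypothetical name, the script's two-step check and its parallelism are not reproduced, and the resolver command is a parameter so the function can be tried without network access.

```bash
#!/usr/bin/env bash
# Sketch only: label a domain HIT if the resolver command succeeds, FAULT otherwise.
# check_domain is a hypothetical helper; the resolver defaults to "host".
check_domain() {
  local domain=$1 resolver=${2:-host}
  if "$resolver" "$domain" > /dev/null 2>&1; then
    echo "HIT $domain"
  else
    echo "FAULT $domain"
  fi
}

# Example (requires network access and the host utility):
#   while IFS= read -r d; do check_domain "$d"; done < domains.txt
```

Running the lookups one at a time is slow for millions of domains; a real run would need to issue many of them concurrently.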
Remove government domains (.gov) and other related TLDs from BlackWeb.
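As a sketch of this filter: the labels gov, gob and mil, the optional two-letter country code, and the function name `drop_gov_domains` are assumptions drawn from the sample input in this section, not the pattern actually used by bwupdate.sh.

```bash
#!/usr/bin/env bash
# Sketch only: drop names ending in .gov, .gob or .mil, optionally followed
# by a two-letter country code (for example .gov.uk or .gob.mx).
# drop_gov_domains is a hypothetical helper; input is assumed to be lowercase.
drop_gov_domains() {
  grep -Ev '\.(gov|gob|mil)(\.[a-z]{2})?$'
}
```

Applied to .argentina.gob.ar, .mydomain.com, .gob.mx, .gov.uk and .navy.mil, it keeps only .mydomain.com.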
Input:
.argentina.gob.ar
.mydomain.com
.gob.mx
.gov.uk
.navy.mil

Output:
.mydomain.com

Run Squid-Cache with BlackWeb; any errors are written to SquidError.txt on your desktop.

BlackWeb: Done 06/05/2023 15:47:14

- The default path of BlackWeb is /etc/acl. You can change it to your preference.
- bwupdate.sh includes lists of remote support related domains (TeamViewer, AnyDesk, LogMeIn, etc.) and web3 domains. They are commented out by default (unless their domains are in SOURCES). To block or exclude them, you must activate the corresponding lines in the script (# JOIN LIST), although this is not recommended, to avoid conflicts or false positives.
- If you need to interrupt the execution of bwupdate.sh (Ctrl + C) and it stopped at the DNS lookup part, it will restart from that point. If you stop it earlier, you will have to start from the beginning or modify the script manually so that it starts from the desired point.
- If you use aufs, temporarily change it to ufs during the upgrade, to avoid: ERROR: Can't change type of existing cache_dir aufs /var/spool/squid to ufs. Restart required.
- If you consider that a domain should not be in BlackWeb, you can create an Issue to request its removal.
- ABPindo - indonesianadblockrules
- abuse.ch - hostfile
- Adaway - host
- adblockplus - advblock Russian
- adblockplus - antiadblockfilters
- adblockplus - easylistchina
- adblockplus - easylistlithuania
- anudeepND - adservers
- anudeepND - coinminer
- AssoEchap - stalkerware-indicators
- azet12 - KADhosts
- badmojr - 1Hosts
- BarbBlock - blacklists
- BBcan177 - minerchk
- BBcan177 - MS-2
- BBcan177 - referrer-spam-blacklist
- betterwebleon - slovenian-list
- bigdargon - hostsVN
- BlackJack8 - iOSAdblockList
- BlackJack8 - webannoyances
- blocklistproject - everything
- cert.pl - List of malicious domains
- chadmayfield - porn top
- chadmayfield - porn_all
- chainapsis - phishing-block-list
- cjx82630 - Chinese CJX's Annoyance List
- cobaltdisco - Google-Chinese-Results-Blocklist
- crazy-max - WindowsSpyBlocker
- CriticalPathSecurity - Public-Intelligence-Feeds
- DandelionSprout - adfilt
- Dawsey21 - adblock-list
- Dawsey21 - main-blacklist
- developerdan - ads-and-tracking-extended
- digitalside - Threat-Intel
- Disconnect.me - simple_ad
- Disconnect.me - simple_malvertising
- Disconnect.me - simple_tracking
- dorxmi - nothingblock
- Eallion - uBlacklist
- EasyList - EasyListHebrew
- ethanr - dns-blacklists
- fabriziosalmi - blacklists
- FadeMind - 2o7Net
- FadeMind - Risk
- FadeMind - Spam
- firebog - AdguardDNS
- firebog - Admiral
- firebog - Easylist
- firebog - Easyprivacy
- firebog - Kowabit
- firebog - neohostsbasic
- firebog - Prigent-Ads
- firebog - Prigent-Crypto
- firebog - Prigent-Malware
- firebog - RPiList-Malware
- firebog - RPiList-Phishing
- firebog - WaLLy3K
- frogeye - firstparty-trackers-hosts
- gardar - Icelandic ABP List
- greatis - Anti-WebMiner
- hagezi - dns-blocklists
- heradhis - Indonesian ABPindo
- hexxium - threat-list
- hoshsadiq - adblock-nocoin-list
- hostsfile - hosts
- jawz101 - potentialTrackers
- jdlingyu - ad-wars
- joelotz - URL_Blacklist
- liamja - Prebake Filter Obtrusive Cookie Notices
- malware-filter - URLhaus Malicious URL Blocklist
- malware-filter - phishing-filter-hosts
- Matomo-org - spammers
- MBThreatIntel - malspam
- mine.nu - hosts0
- mitchellkrogza - Badd-Boyz-Hosts
- mitchellkrogza - hacked-domains
- mitchellkrogza - nginx-ultimate-bad-bot-blocker
- mitchellkrogza - strip_domains
- molinero - hBlock
- NanoAdblocker - NanoFilters
- neodevpro - neodevhost
- notabug - Latvian List
- notracking - hosts-blocklists
- Oleksiig - Squid-BlackList
- openphish - feed
- pengelana - domains blocklist
- Perflyst - PiHoleBlocklist Android
- Perflyst - PiHoleBlocklist SmartTV
- phishing.army - phishing_army_blocklist_extended
- piperun - iploggerfilter
- Quedlin - domains
- quidsup - notrack-blocklists
- quidsup - notrack-malware
- reddestdream - MinimalHostsBlocker
- RooneyMcNibNug - pihole-stuff
- Rpsl - adblock-leadgenerator-list
- ruvelro - Halt-and-Block-Mining
- ryanbr - fanboy-adblock
- scamaNet - blocklist
- Someonewhocares - hosts
- stanev.org - Bulgarian adblock list
- StevenBlack - add.2o7Net
- StevenBlack - add.Risk
- StevenBlack - fakenews-gambling-porn-social
- StevenBlack - hosts
- StevenBlack - spam
- StevenBlack - uncheckyAds
- Stopforumspam - Toxic Domains
- Taz - SpamDomains
- tomasko126 - Easylist Czech and Slovak filter list
- txthinking - blackwhite
- txthinking - bypass china domains
- Ultimate Hosts Blacklist - hosts
- Université Toulouse 1 Capitole - Blacklists UT1
- vokins - yhosts
- Winhelp2002 - hosts
- yourduskquibbles - Web Annoyances Ultralist
- yous - YousList
- yoyo - Peter Lowe’s Ad and tracking server list
- zerodot1 - CoinBlockerLists list_browser
- zerodot1 - CoinBlockerLists list
- zerodot1 - CoinBlockerLists list_optional
- zoso - Romanian Adblock List
- 280blocker - 280blocker_domain
- abuse.ch - Ransomware Abuse CryptoWall
- abuse.ch - Ransomware Abuse Domain Blocklist
- abuse.ch - Ransomware Abuse Locky
- abuse.ch - Ransomware Abuse RW_URLB
- abuse.ch - Ransomware Abuse TorrentLocker
- abuse.ch - Ransomware Abuse URL Blocklist
- abuse.ch - Zeustracker
- adblockplus - malwaredomains_full
- BambenekConsulting - dga-feed
- Carl - Spam
- cedia - domains
- cedia - immortal_domains
- CHEF-KOCH - BarbBlock-filter-list
- Cibercrime-Tracker
- dshield - High
- dshield - Low
- dshield - Medium
- firebog - Airelle-hrsk
- firebog - Airelle-trc
- firebog - BillStearns
- firebog - Prigent-Phishing
- firebog - Shalla-mal
- gfmaster - adblock-korea
- hosts-file - ad_servers
- hosts-file - emd
- hosts-file - grm
- hosts-file - hosts
- hosts-file - psh
- Joewein - dom-bl-base
- Joewein - dom-bl
- malc0de - bl
- Malware Domains - justdomains
- Malwaredomainlist - Hosts
- MESD squidguard - blacklists
- Netlab360 - DGA Domains
- nfz.moe - hosts
- Passwall - SpamAssassin
- ShadowWhisperer - BlockLists
- Shallalist.de - shallalist
- squidblacklist.org - dg-ads
- squidblacklist.org - dg-malicious.acl
- tankmohit - UnifiedHosts
- UrlBlacklist - UrlBlacklist
- zerodot1 - list_browser_UBO
- firebog sources
- google supported domains
- iana
- ipv6-hosts (Partial)
- publicsuffix
- Ransomware Database
- University Domains and Names Data List
- whoisxmlapi
- Awesome Open Source: Blackweb
- community ipfire: url filter and self updating blacklists
- covert.io: Getting Started with DGA Domain Detection Research
- crazymax: WindowsSpyBlocker
- Jason Trost: Getting Started with DGA Domain Detection Research
- kandi.openweaver: Domains Blocklist for Squid-Cache
- Kerry Cordero: Blocklists of Suspected Malicious IPs and URLs
- Keystone Solutions: blocklists
- Lifars: Sites with blocklist of malicious IPs and URLs
- opensourcelibs: Blackweb
- OSINT Framework: Domain Name/Domain Blacklists/Blackweb
- osintbay: blackweb
- Secrepo: Samples of Security Related Data
- Segu-Info: Análisis de malware y sitios web en tiempo real
- Segu-Info: Dominios/TLD dañinos que pueden ser bloqueados para evitar spam y #phishing
- Soficas: CiberSeguridad - Protección Activa
- stackoverflow: Blacklist IP database
- Wikipedia: Blacklist_(computing)
- Xploitlab: Projects using WindowsSpyBlocker
- Zeltser: Free Blocklists of Suspected Malicious IPs and URLs
- zenarmor: How-to-enable-web-filtering-on-OPNsense-proxy?
We thank everyone who has contributed to this project. Anyone interested can contribute by sending us links to new lists to be included in it.
Special thanks to: Jhonatan Sneider
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Due to recent arbitrary changes in computer terminology, it is necessary to clarify the meaning and connotation of the term blacklist, associated with this project:
In computing, a blacklist, denylist or blocklist is a basic access control mechanism that allows through all elements (email addresses, users, passwords, URLs, IP addresses, domain names, file hashes, etc.), except those explicitly mentioned. Those items on the list are denied access. The opposite is a whitelist, which means only items on the list are let through whatever gate is being used. Source: Wikipedia
Therefore, blacklist, blocklist, blackweb, blackip, whitelist and similar, are terms that have nothing to do with racial discrimination.
