This script downloads, converts and installs AdblockPlus lists into Privoxy.
Tracker for bugs and feature requests:
Bugtracker: privoxy-blocklist.sh
The PKGBUILD for this script can be found here:
http://aur.archlinux.org/packages.php?ID=43861
And the package can be installed from my repository.
It installs the script to '/usr/sbin/' and also a cronjob to '/etc/cron.weekly/'.
Warranty
Security issue
Dependencies:
Version history:
0.2:
There was a change of the filenames needed.
So, to ensure that Privoxy keeps working properly, you have to remove all .action and .filter files except user.*, match-all.* and default.* from your Privoxy configuration directory, and remove the corresponding entries from the config file.
0.1:
The development version of this script can be found on github.com
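For the 0.2 upgrade noted in the version history, the files to remove can be listed before anything is deleted. The following is only a sketch: the function name list_stale_privoxy_files is hypothetical, /etc/privoxy is assumed as the default configuration directory, and the function prints candidates without removing them.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script): print all .action and .filter
# files in a Privoxy configuration directory except user.*, match-all.* and
# default.*. It only lists candidates; nothing is deleted.
list_stale_privoxy_files() {
    local confdir="${1:-/etc/privoxy}"
    find "${confdir}" -maxdepth 1 -type f \
        \( -name '*.action' -o -name '*.filter' \) \
        ! -name 'user.*' ! -name 'match-all.*' ! -name 'default.*' \
        | sort
}
```

After reviewing the output of list_stale_privoxy_files /etc/privoxy, the listed files can be removed by hand, together with their actionsfile and filterfile lines in the config.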
#!/bin/bash
#
######################################################################
#
#                  Author: Andrwe Lord Weber
#                  Mail: lord-weber-andrwe<at>renona-studios<dot>org
#                  Version: 0.2
#                  URL: http://andrwe.dyndns.org/doku.php/blog/scripting/bash/privoxy-blocklist
#
##################
#
#                  Sumary:
#                   This script downloads, converts and installs
#                   AdblockPlus lists into Privoxy
#
######################################################################

######################################################################
#
#                 TODO:
#                  - implement:
#                     domain-based filter
#
######################################################################

######################################################################
#
#  script variables and functions
#
######################################################################

# array of URL for AdblockPlus lists
URLS=("https://easylist-downloads.adblockplus.org/easylistgermany.txt" "http://adblockplus.mozdev.org/easylist/easylist.txt")
# privoxy config dir (default: /etc/privoxy/)
CONFDIR=/etc/privoxy
# directory for temporary files
TMPDIR=/tmp/privoxy-blocklist
TMPNAME=$(basename ${0})

######################################################################
#
#  No changes needed after this line.
#
######################################################################

function usage()
{
  echo "${TMPNAME} is a script to convert AdBlockPlus-lists into Privoxy-lists and install them."
  echo " "
  echo "Options:"
  echo "      -h:    Show this help."
  echo "      -q:    Don't give any output."
  echo "      -v 1:  Enable verbosity 1. Show a little bit more output."
  echo "      -v 2:  Enable verbosity 2. Show a lot more output."
  echo "      -v 3:  Enable verbosity 3. Show all possible output and don't delete temporary files.(For debugging only!!)"
  echo "      -r:    Remove all lists build by this script."
}

[ ${UID} -ne 0 ] && echo -e "Root privileges needed. Exit.\n\n" && usage && exit 1

# check whether an instance is already running
[ -e ${TMPDIR}/${TMPNAME}.lock ] && echo "An Instance of ${TMPNAME} is already running. Exit" && exit

DBG=0

function debug()
{
  [ ${DBG} -ge ${2} ] && echo -e "${1}"
}

function main()
{
  cpoptions=""
  [ ${DBG} -gt 0 ] && cpoptions="-v"

  for url in ${URLS[@]}
  do
    debug "Processing ${url} ...\n" 0
    file=${TMPDIR}/$(basename ${url})
    actionfile=${file%\.*}.script.action
    filterfile=${file%\.*}.script.filter
    list=$(basename ${file%\.*})

    # download list
    debug "Downloading ${url} ..." 0
    wget -t 3 --no-check-certificate -O ${file} ${url} >${TMPDIR}/wget-${url//\//#}.log 2>&1
    debug "$(cat ${TMPDIR}/wget-${url//\//#}.log)" 2
    debug ".. downloading done." 0
    [ "$(grep -E '^\[Adblock.*\]$' ${file})" == "" ] && echo "The list recieved from ${url} isn't an AdblockPlus list. Skipped" && continue

    # convert AdblockPlus list to Privoxy list
    # blacklist of urls
    debug "Creating actionfile for ${list} ..." 1
    echo -e "{ +block{${list}} }" > ${actionfile}
    sed '/^!.*/d;1,1 d;/^@@.*/d;/\$.*/d;/#/d;s/\./\\./g;s/\?/\\?/g;s/\*/.*/g;s/(/\\(/g;s/)/\\)/g;s/\[/\\[/g;s/\]/\\]/g;s/\^/[\/\&:\?=_]/g;s/^||/\./g;s/^|/^/g;s/|$/\$/g;/|/d' ${file} >> ${actionfile}

    debug "... creating filterfile for ${list} ..." 1
    echo "FILTER: ${list} Tag filter of ${list}" > ${filterfile}
    # set filter for html elements
    sed '/^#/!d;s/^##//g;s/^#\(.*\)\[.*\]\[.*\]*/s|<([a-zA-Z0-9]+)\\s+.*id=.?\1.*>.*<\/\\1>||g/g;s/^#\(.*\)/s|<([a-zA-Z0-9]+)\\s+.*id=.?\1.*>.*<\/\\1>||g/g;s/^\.\(.*\)/s|<([a-zA-Z0-9]+)\\s+.*class=.?\1.*>.*<\/\\1>||g/g;s/^a\[\(.*\)\]/s|<a.*\1.*>.*<\/a>||g/g;s/^\([a-zA-Z0-9]*\)\.\(.*\)\[.*\]\[.*\]*/s|<\1.*class=.?\2.*>.*<\/\1>||g/g;s/^\([a-zA-Z0-9]*\)#\(.*\):.*[:[^:]]*[^:]*/s|<\1.*id=.?\2.*>.*<\/\1>||g/g;s/^\([a-zA-Z0-9]*\)#\(.*\)/s|<\1.*id=.?\2.*>.*<\/\1>||g/g;s/^\[\([a-zA-Z]*\).=\(.*\)\]/s|\1^=\2>||g/g;s/\^/[\/\&:\?=_]/g;s/\.\([a-zA-Z0-9]\)/\\.\1/g' ${file} >> ${filterfile}
    debug "... filterfile created - adding filterfile to actionfile ..." 1
    echo "{ +filter{${list}} }" >> ${actionfile}
    echo "*" >> ${actionfile}
    debug "... filterfile added ..." 1

    debug "... creating and adding whitlist for urls ..." 1
    # whitelist of urls
    echo "{ -block }" >> ${actionfile}
    sed '/^@@.*/!d;s/^@@//g;/\$.*/d;/#/d;s/\./\\./g;s/\?/\\?/g;s/\*/.*/g;s/(/\\(/g;s/)/\\)/g;s/\[/\\[/g;s/\]/\\]/g;s/\^/[\/\&:\?=_]/g;s/^||/\./g;s/^|/^/g;s/|$/\$/g;/|/d' ${file} >> ${actionfile}
    debug "... created and added whitelist - creating and adding image handler ..." 1

    # whitelist of image urls
    echo "{ -block +handle-as-image }" >> ${actionfile}
    sed '/^@@.*/!d;s/^@@//g;/\$.*image.*/!d;s/\$.*image.*//g;/#/d;s/\./\\./g;s/\?/\\?/g;s/\*/.*/g;s/(/\\(/g;s/)/\\)/g;s/\[/\\[/g;s/\]/\\]/g;s/\^/[\/\&:\?=_]/g;s/^||/\./g;s/^|/^/g;s/|$/\$/g;/|/d' ${file} >> ${actionfile}
    debug "... created and added image handler ..." 1
    debug "... created actionfile for ${list}." 1

    # install Privoxy actionsfile
    cp ${cpoptions} ${actionfile} ${CONFDIR}
    if [ "$(grep $(basename ${actionfile}) ${CONFDIR}/config)" == "" ]
    then
      debug "\nModifying ${CONFDIR}/config ..." 0
      sed "s/^actionsfile user\.action/actionsfile $(basename ${actionfile})\nactionsfile user.action/" ${CONFDIR}/config > ${TMPDIR}/config
      debug "... modification done.\n" 0
      debug "Installing new config ..." 0
      cp ${cpoptions} ${TMPDIR}/config ${CONFDIR}
      debug "... installation done\n" 0
    fi

    # install Privoxy filterfile
    cp ${cpoptions} ${filterfile} ${CONFDIR}
    if [ "$(grep $(basename ${filterfile}) ${CONFDIR}/config)" == "" ]
    then
      debug "\nModifying ${CONFDIR}/config ..." 0
      sed "s/^\(#*\)filterfile user\.filter/filterfile $(basename ${filterfile})\n\1filterfile user.filter/" ${CONFDIR}/config > ${TMPDIR}/config
      debug "... modification done.\n" 0
      debug "Installing new config ..." 0
      cp ${cpoptions} ${TMPDIR}/config ${CONFDIR}
      debug "... installation done\n" 0
    fi

    debug "... ${url} installed successfully.\n" 0
  done
}

# create temporary directory and lock file
mkdir -p ${TMPDIR}
touch ${TMPDIR}/${TMPNAME}.lock

# set command to be run on exit
[ ${DBG} -le 2 ] && trap "rm -fr ${TMPDIR};exit" INT TERM EXIT

# loop for options
while getopts ":hrqv:" opt
do
  case "${opt}" in
    "h")
      usage
      exit 0
      ;;
    "v")
      DBG="${OPTARG}"
      ;;
    "q")
      DBG=-1
      ;;
    "r")
      echo "Do you really want to remove all build lists?(y/N)"
      read choice
      [ "${choice}" != "y" ] && exit 0
      rm -rf ${CONFDIR}/*.script.{action,filter} && \
      sed '/^actionsfile .*\.script\.action$/d;/^filterfile .*\.script\.filter$/d' -i ${CONFDIR}/config && \
      echo "Lists removed." && exit 0
      echo -e "An error occured while removing the lists.\nPlease have a look into ${CONFDIR} whether there are .script.* files and search for *.script.* in ${CONFDIR}/config."
      exit 1
      ;;
    ":")
      echo "${TMPNAME}: -${OPTARG} requires an argument" >&2
      exit 1
      ;;
  esac
done

debug "URL-List: ${URLS}\nPrivoxy-Configdir: ${CONFDIR}\nTemporary directory: ${TMPDIR}" 2

main

# restore default exit command
trap - INT TERM EXIT
[ ${DBG} -lt 2 ] && rm -r ${TMPDIR}
[ ${DBG} -eq 2 ] && rm -vr ${TMPDIR}
exit 0
@Sebastian:
Could you please explain the problems you have here:
Bugtracker: privoxy-blocklist.sh
Feature request, pls:
Support /usr/local for install prefix.
Support customizable names for action and filter file, i.e.
adblock.filter adblock.action
@cb:
I don't understand what you mean by supporting /usr/local as install prefix.
You can just copy the script to any place you want and start it.
The only path that is set in the script through a variable is the config directory of Privoxy (/etc/privoxy).
Hi Andrwe,
Awesome script! I can finally remove loads of ads while surfing with the iPad.
Is the script still maintained currently? Been a while since the last update. I'm really looking forward to domain based filters, meaning the loads of ||example.com^$third-party filters currently in easylist.
@Casper: Hi Casper,
Yes the script is still maintained.
For the last few months I have been very busy with my work, and unfortunately that will stay the same for the next few.
But I'm definitely trying to update the script this year.
So please be a little more patient.
Andrwe: Thanks for the great script. It's perfect for blocking ads on mobile devices.
I use a third party list to disable social media buttons. Is there any way to modify the script to allow import of a list from a source such as: https://monzta.maltekraus.de/adblock_social.txt
G'day All
There seems to be an issue with the size of the filters: a single filter can't be larger than about 4000-4990 characters. Anything beyond that is ignored by Privoxy, so you might wish to break up the extra-large filter into smaller, more manageable filters.
P.S. I have well over 900 ad-tracker filters, each holding up to 4500 characters, which run extremely well, and the blocking rates are high. If you want a sample of a filter, let me know!
Cheers
@Wes: hi,
you just have to add this source to the URLS-array like this:
URLS=("https://easylist-downloads.adblockplus.org/easylistgermany.txt" "http://adblockplus.mozdev.org/easylist/easylist.txt" "https://monzta.maltekraus.de/adblock_social.txt")
Regarding bug 9: On Ubuntu 11.04 with bash version 4.2.8(1), lines 210, 211 and 212 in the current git version are missing && after the ... part. The error is:
privoxy-blocklist.sh: line 210: syntax error near unexpected token `echo'
privoxy-blocklist.sh: line 210: `-z "${PRIVOXY_CONF}" echo "\$PRIVOXY_CONF isn't set please either provice a valid initscript config or set it in ${SCRIPTCONF} ." >&2 && exit 1'
After changing -z "${PRIVOXY_CONF}" echo … to -z "${PRIVOXY_CONF}" && echo … the script works. It is probably not a bad idea to use a more conservative syntax if possible.
Thank You for a great script!
The last comment is missing square brackets, but I'm certain You understand the problem.
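The "more conservative syntax" suggested above could look like the following sketch, which replaces the chained [ ... ] && echo ... && exit form with an explicit if block. The helper name require_set is hypothetical and not part of the script; PRIVOXY_CONF is the variable from the error message.

```shell
#!/bin/bash
# Hypothetical helper: fail with a message when a required variable is empty.
# An explicit if-block avoids the "[ ... ] && echo ... && exit" chain, where a
# single missing "&&" turns into a syntax error.
require_set() {
    local name="$1" value="$2"
    if [ -z "${value}" ]; then
        echo "\$${name} isn't set." >&2
        return 1
    fi
    return 0
}
```

In the script this would be called as require_set PRIVOXY_CONF "${PRIVOXY_CONF}" || exit 1.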
Hi. I use Privoxy on a router with OpenWrt. When I execute your script I get this error message:
root@OpenWrt /opt# bash blocklist.sh
Processing https://easylist-downloads.adblockplus.org/easylistgermany.txt ...

Downloading https://easylist-downloads.adblockplus.org/easylistgermany.txt ...
.. downloading done.
grep: /opt/privoxy-blocklist/easylistgermany.txt: No such file or directory
The list recieved from https://easylist-downloads.adblockplus.org/easylistgermany.txt isn't an AdblockPlus list. Skipped
Processing http://adblockplus.mozdev.org/easylist/easylist.txt ...

Downloading http://adblockplus.mozdev.org/easylist/easylist.txt ...
Segmentation fault
.. downloading done.
grep: /opt/privoxy-blocklist/easylist.txt: No such file or directory
The list recieved from http://adblockplus.mozdev.org/easylist/easylist.txt isn't an AdblockPlus list. Skipped
root@OpenWrt /opt#
What could this be?
@14:
Hi,
is there enough space on /opt/ for the lists?
Can you please provide the output of bash blocklist.sh -v 2?
It would be best to provide the output using something like pastebin.org because it can be a lot of text.
Andrwe Lord Weber
Hi,
I'm glad I found this script, but I got the error below. It says that the lists downloaded from easylist-downloads.adblockplus.org are not AdblockPlus lists and that there are no such files or directories.
Also, I would like to use Privoxy on my home network with a blacklist of porn sites, such as the ones used with squidguard (http://www.squidguard.org/blacklists.html). Can they simply be added to the URLS array in the script, as mentioned to Wes (message 8)? The script will probably not unzip files, is that right?
Thank you
Generated error message:
Processing https://easylist-downloads.adblockplus.org/easylistgermany.txt ...

Downloading https://easylist-downloads.adblockplus.org/easylistgermany.txt ...
.. downloading done.
grep: /tmp/privoxy-blocklist/easylistgermany.txt: No such file or directory
The list recieved from https://easylist-downloads.adblockplus.org/easylistgermany.txt isn't an AdblockPlus list. Skipped
Processing http://adblockplus.mozdev.org/easylist/easylist.txt ...

Downloading http://adblockplus.mozdev.org/easylist/easylist.txt ...
.. downloading done.
grep: /tmp/privoxy-blocklist/easylist.txt: No such file or directory
The list recieved from http://adblockplus.mozdev.org/easylist/easylist.txt isn't an AdblockPlus list. Skipped
Following my previous message, here is the privoxy-blocklist.sh -v2 output.
-v2 output:
URL-List: https://easylist-downloads.adblockplus.org/easylistgermany.txt
Privoxy-Configdir: /usr/local/etc/privoxy
Temporary directory: /tmp/privoxy-blocklist
Processing https://easylist-downloads.adblockplus.org/easylistgermany.txt ...

Downloading https://easylist-downloads.adblockplus.org/easylistgermany.txt ...
/Users/lorenzo/Desktop/privoxy-blocklist.sh: line 86: wget: command not found
.. downloading done.
grep: /tmp/privoxy-blocklist/easylistgermany.txt: No such file or directory
The list recieved from https://easylist-downloads.adblockplus.org/easylistgermany.txt isn't an AdblockPlus list. Skipped
Processing http://adblockplus.mozdev.org/easylist/easylist.txt ...

Downloading http://adblockplus.mozdev.org/easylist/easylist.txt ...
/Users/lorenzo/Desktop/privoxy-blocklist.sh: line 86: wget: command not found
.. downloading done.
grep: /tmp/privoxy-blocklist/easylist.txt: No such file or directory
The list recieved from http://adblockplus.mozdev.org/easylist/easylist.txt isn't an AdblockPlus list. Skipped
/tmp/privoxy-blocklist/privoxy-blocklist.sh.lock
/tmp/privoxy-blocklist/wget-http:##adblockplus.mozdev.org#easylist#easylist.txt.log
/tmp/privoxy-blocklist/wget-https:##easylist-downloads.adblockplus.org#easylistgermany.txt.log
/tmp/privoxy-blocklist
mbp-i5:~ lorenzo$
So sorry, I just noticed that I was missing the wget dependency. The installation ran successfully, but now my browser (well, Privoxy) is refusing any connection.
After running the script, the new .action line in the config file is not written properly. It ends up as "actionsfile easylist.script.actionnactionsfile user.action" and "filterfile easylist.script.filternfilterfile user.filter".
Privoxy doesn't accept connections anymore until it is reinstalled.
config:
actionsfile match-all.action # Actions that are applied to all sites and maybe overruled later on.
actionsfile default.action # Main actions file
actionsfile easylist.script.actionnactionsfile user.action # User customizations

filterfile default.filter
filterfile easylist.script.filternfilterfile user.filter # User customizations
Sorry for the multiple posts.
Well this made me angry. Thanks. What Lorenzo describes above is correct.
What Lorenzo perhaps doesn't make crystal clear is that the script is seemingly so crappily coded that it doesn't work ("actionsfile easylist.script.actionnactionsfile" etc.), and that it is not clear how to change "config" to make it work. You should *steer clear of this stupid thing altogether* or face having to reinstall Privoxy all over again, because even deleting the silly erroneous changes (i.e. "easylist.script.filternfilterfile") that this broken script adds doesn't seem to fix whatever it broke.
@Lebowski
The script does work; I fixed the problem by upgrading the dependencies.
I now have bash 4.2.24(2), wget 1.13.4, sed 4.2.1 and the script works correctly.
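Entries such as "easylist.script.actionnactionsfile" are what a sed produces when it does not turn \n in the replacement text into a newline; BSD sed (as shipped with macOS) behaves this way, while GNU sed inserts a line break. As a sketch of a way around that dependency, the line can be inserted with awk instead. The function name insert_before is hypothetical, and this simplified version does not handle the commented-out "#filterfile user.filter" case that the script's sed expression covers.

```shell
#!/bin/bash
# Hypothetical helper: print a Privoxy config with NEW_LINE inserted directly
# before the first line that starts with ANCHOR. awk writes a real newline
# between the two lines, so the result does not depend on how the local sed
# treats "\n" in a replacement. ANCHOR is matched literally, not as a regex.
insert_before() {
    local anchor="$1" new_line="$2" config="$3"
    awk -v anchor="${anchor}" -v new_line="${new_line}" '
        !inserted && index($0, anchor) == 1 { print new_line; inserted = 1 }
        { print }
    ' "${config}"
}
```

A call such as insert_before "actionsfile user.action" "actionsfile easylist.script.action" /etc/privoxy/config > /tmp/privoxy-blocklist/config mirrors what the script does with sed before copying the new config into place.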
./privoxy-blocklist.sh: 33: Syntax error: "(" unexpected
Hmm, weird… This looks correct to me.
Is this still an issue?
Try "bash privoxy-blocklist.sh" instead of "sh privoxy-blocklist.sh".
Could you read up on the actions file format at http://www.privoxy.org/user-manual/actions-file.html and then modify your script so that it handles the domain and path parts correctly?
About that "./privoxy-blocklist.sh: 33: Syntax error: "(" unexpected. Hmm, weird… This looks correct to me."
I've encountered that on the ksh of my OpenBSD setup. There are two things to be taken into account:
1. Change the path to bash from /bin/bash to wherever bash is located (/usr/local/bin/bash in my case)
2. Run the script using bash or the full path to the bash executable.
Boom it works. Even on OpenBSD, which sure isn't Arch Linux :D
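Both points can be sketched in a few lines. The shebang #!/usr/bin/env bash looks bash up via PATH instead of assuming /bin/bash, and a small guard can tell users who start the script with sh or ksh what went wrong. The function name running_under_bash is hypothetical; it relies on BASH_VERSION, which bash sets itself.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed only when the current interpreter is bash.
# BASH_VERSION is set by bash and is normally empty in sh, dash or ksh.
running_under_bash() {
    [ -n "${BASH_VERSION:-}" ]
}

# Intended use near the top of the script, before any bash-only syntax
# such as arrays:
#   running_under_bash || { echo "Please run: bash $0" >&2; exit 1; }
```

The guard is written in plain POSIX shell so that it can run in those other shells before they reach the bash-only parts.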
Hello,
I have the same problem with "./privoxy-blocklist.sh: 33: Syntax error: "(". My server runs Debian Squeeze. Do you have an idea how to solve this problem? Cordially, Guillaume
Seems to work for me, no errors as far as I can tell. However, when moving to privoxy from AdBlock+, the pesky facebook sponsored sidebar ads reappeared, even with the adp+ patterns implemented in privoxy. Anyone know why?
@wrox: Facebook uses SSL-encrypted connections. That's the reason why Privoxy is not able to remove ads there (Privoxy is unable by design to decrypt SSL connections). AdBlock can do that because it's a browser plugin, which has access to the unencrypted content.
I got this to work just fine on the latest OpenWrt trunk. You have to install bash, wget, and sed. Don't use the wget and sed that are built into BusyBox; they will fail.
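A check for the three tools named above could be sketched as follows. The function name missing_commands is hypothetical, and the check only tests whether a command is found in PATH; it cannot tell a BusyBox applet from the full GNU tool.

```shell
#!/bin/bash
# Hypothetical helper: print every command from the argument list that is not
# found in PATH and return non-zero if at least one is missing.
missing_commands() {
    local cmd status=0
    for cmd in "$@"; do
        if ! command -v "${cmd}" >/dev/null 2>&1; then
            echo "${cmd}"
            status=1
        fi
    done
    return "${status}"
}
```

Called as missing=$(missing_commands bash wget sed) || { echo "Missing: ${missing}" >&2; exit 1; }, it would stop the script before the download step instead of letting a failed wget surface later as "isn't an AdblockPlus list".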
Hi Andrwe,
Thank you for your script. It worked great for me.
But I have a question (or feature request):
Adblock+ has an export function for all filter rules, including the rules of my own that I "collected" over the past years. Is there a way to convert them and add them to Privoxy with a modified version of your script?
Thanks. I've been looking for something like this for a long time.
The script generates an error on Debian Stretch (translated from the German message): sed: character class syntax is [[:space:]], not [:space:]
What do I have to change?
It seems that an escape character is missing in front of the first ":".
Original part: *[:[^:]]*[^:]*
Changed part: *[\:[^:]]*[^:]*
I hope that this change will do what the script has to do.
Hello. Is it possible to support uBlock/uMatrix lists in the future? All I could find is this discussion: https://www.reddit.com/r/privacy/comments/3cn7md/ublock_origin_vs_privoxy/ where SAI_Peregrinus wrote: "Not really. eg uMatrix can block all requests from site A to site B, but allow requests from site C to site B. Privoxy would just block all requests to B."
I never tried to learn their format; I just use ready-made solutions (in uBlock Origin, ticking the various lists to download).
@Nikita: I mentioned uMatrix by mistake, only because I had seen it mentioned somewhere in the same context as uBlock. I did not know that they use different rule syntaxes.
So this question is about uBlock→Privoxy conversion.
I ask because I have seen mentions of uBlock's extended syntax compared to the original Adblock syntax.
I also have the problem that haroon described. The filter file for Privoxy isn't generated any longer because of: "sed: character class syntax is [[:space:]], not [:space:]". There's a problem with the regex. Maybe the format of the Adblock files has changed somewhat, so this didn't occur in the past. Any idea what to change?
Hello,
I don't think this script works anymore.