Re: Balsa can't download a mail with a long line, gets stuck in endless dup-creating loop.
- From: Rob Landley <rob landley net>
- To: Jack Ostroff <ostroffjh sbcglobal net>
- Cc: balsa-list gnome org
- Subject: Re: Balsa can't download a mail with a long line, gets stuck in endless dup-creating loop.
- Date: Thu, 01 Nov 2012 16:50:12 -0500
On 10/31/2012 09:15:38 AM, Jack Ostroff wrote:
On 2012.10.30 19:54, Rob Landley wrote:
On 10/29/2012 05:11:47 PM, Jack wrote:
On 2012.10.29 17:46, Rob Landley wrote:
.....
...I haven't exited balsa to run my filters yet
You can run filters under menu "Mailbox/Select Filters"
If I could get Balsa to actually match a list-id tag, I would.
http://landley.net/notes-2012.html#15-10-2012
I haven't read all of this yet, so I don't know what you've tried, or
why it isn't working, but would there be any point in starting a new
thread about this problem with filters?
If I can get it built from source, sure. If I can't, not much point.
Separately, although I can imagine you're already down a particular
path to working through that huge inbox, is it worth considering a
slightly different approach? What about writing a script to break
that huge mbox into smaller chunks - with maybe up to a few thousand
messages in each? Whether you process with Balsa or your scripts, it
might be easier (and safer?) to work incrementally. Sorry if you've
already thought of this and have a good reason for not.
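The chunking idea above could be sketched roughly as follows. This is a minimal Python 3 illustration, not anything Balsa provides: `split_mbox`, its parameters, and the chunk file naming are all hypothetical, and it assumes the mbox escapes body lines that begin with "From " (as most writers do), since it treats every such line as the start of a message.

```python
import os

def split_mbox(lines, chunk_size, out_dir, prefix="chunk"):
    """Copy messages from an iterable of mbox lines (bytes) into chunk files.

    Each chunk holds at most chunk_size messages. Returns the list of
    chunk paths written. (Hypothetical helper, for illustration only.)
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    out = None
    in_chunk = 0  # messages written to the current chunk so far
    for line in lines:
        if line.startswith(b"From "):
            # A new message begins; open a fresh chunk if the current one is full.
            if out is None or in_chunk >= chunk_size:
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, "%s-%04d.mbox" % (prefix, len(paths)))
                out = open(path, "wb")
                paths.append(path)
                in_chunk = 0
            in_chunk += 1
        if out is not None:  # ignore anything before the first "From " line
            out.write(line)
    if out is not None:
        out.close()
    return paths
```

Called as `split_mbox(open("mail/inbox", "rb"), 2000, "mail/chunks")`, it streams the file line by line, so memory use stays flat even for a multi-gigabyte mbox.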
I already dealt with the mbox backlog; the Python script that sorts it
into mboxes is attached if you're bored.
I'm happy to debug balsa's built-in filtering more, but I couldn't get
it to work and have bypassed it for the moment.
I spent a couple hours trying to figure out how to phrase it and
never got a single message moved out of the inbox (imap or local),
so I gave up and wrote a python program.
I've sometimes had luck doing bulk filtering by using the filter
dropdown at the top of the main window. For example, if I filter the
inbox by "Subject or Sender contains:" on the address that sends a
particular list, I may still have to Ctrl-click to select specific
messages, but it's easier than looking at the complete list. Once I
have a bunch selected, I can right-click and "move-to" another
mailbox.
I could never get a rule to trigger on a list-id: header in a message.
(With or without colon, exact case matching, and so on.) Most of my
filtering is done on those.
This was with the exact string match; the regex match says it's not
implemented in the version Ubuntu ships.
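For comparison, here is roughly what a List-Id match looks like outside Balsa, using Python 3's standard `email` package. This is only a sketch and says nothing about how Balsa's filters are implemented; `list_id_matches` and its `use_regex` flag are made-up names. It shows both the plain substring test and a regex test, with the header name matched case-insensitively.

```python
import re
from email.parser import HeaderParser

def list_id_matches(raw_headers, needle, use_regex=False):
    """Return True if any List-Id header in raw_headers matches needle.

    (Hypothetical helper, for illustration only.)
    """
    msg = HeaderParser().parsestr(raw_headers)
    # Header lookup in the email package is case-insensitive, so this also
    # finds "List-ID:" and "list-id:".
    for value in msg.get_all("List-Id", []):
        value = " ".join(value.split())  # collapse folded-header whitespace
        if use_regex:
            if re.search(needle, value, re.IGNORECASE):
                return True
        elif needle.lower() in value.lower():
            return True
    return False
```

The substring form is the same kind of test the attached script does with `find()`; the regex form is what the "not implemented" option would presumably offer.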
It would be nice if I could get balsa to check how many new messages
are in each folder at startup time without me having to click on
each folder before it notices "hey, my cached metadata is stale!"
and reparses the mbox file. But so far, I haven't found a way to
get it to do that.
I have a combination of mbox and Maildir, and having Balsa check mail
again (even if it doesn't actually find and download anything) seems
to get it to recognize which mailboxes have new messages.
$ ls -l linux-kernel
-rw------- 1 landley landley 2229420857 Nov 1 11:39 linux-kernel
$ time cat linux-kernel > /dev/null
real 0m42.225s
user 0m0.080s
sys 0m5.292s
A full rescan of my mbox files understandably takes a little while. I
don't mind so much (my setup is abusive to mail readers, yes), but I
think I'd notice if it was doing it. :)
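A staleness check does not need the 42-second read shown above: comparing each file's size and modification time against cached values is enough to tell which folders changed. The sketch below (Python 3) assumes a hypothetical `cache` dict of previously recorded signatures; it is not a description of what Balsa does, and it only flags folders for a reparse rather than counting new messages.

```python
import os

def mbox_signature(path):
    """Return (size, mtime_ns) for path; no file contents are read."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)

def stale_mailboxes(paths, cache):
    """Return the paths whose current signature differs from the cached one.

    cache maps path -> signature from an earlier run (hypothetical; a real
    client would persist it next to its message index).
    """
    return [p for p in paths if cache.get(p) != mbox_signature(p)]
```

Only the folders returned by `stale_mailboxes` would then need the expensive full parse.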
I need to move to the source version before bugging the list too much
more about these. Alas, I haven't sat down and figured out why the
build system wants a spellcheck library it shouldn't need (and which I
already installed), and how to chop that out.
A ./configure --without-spellcheck would be nice...
Rob
#!/usr/bin/python
# Sort mail/inbox into per-list mbox files (Python 2).
import sys

def readinbox(inbox, handle):
  headers=None
  body=None
  msgs={}
  msglist=[]
  count=0
  for i in inbox:
    # Start of new message
    if i.startswith("From "):
      if body: handle(headers, body, count)
      headers=[i]
      body=None
      count=count+1
    elif not headers: continue
    # Accumulate headers and body lines
    elif body!=None: body.append(i)
    elif i!='\n':
      if headers and i[:1].isspace(): headers[-1]=headers[-1]+i
      else: headers.append(i)
    # Switch from headers to body
    else:
      hdrstr="".join(headers)
      # Discard duplicates (even List-id: must be the same)
      if hdrstr in msgs:
        headers=None
        print "dup %s@%s" % (msgs[hdrstr],count)
      else:
        msgs[hdrstr]=count
        body=[]
  if body: handle(headers, body, count)
  print
  return msgs,msglist

rules=[
("linux-kernel.vger.kernel.org", "linux/linux-kernel"),
("blfs-dev.linuxfromscratch.org", "lfs/blfs"),
("clfs-dev-cross-lfs.org", "lfs/clfs-dev"),
("clfs-support-cross-lfs.org", "lfs/clfs-support"),
("celinux-dev.lists.celinuxforum.org", "linux/celinux-dev"),
("toybox-landley.net", "mine/toybox"),
("users.linux.kernel.org", "linux/users"),
("crossgcc.sourceware.org", "lfs/crossgcc"),
("devicetree-discuss.lists.ozlabs.org", "linux/devicetree"),
("buildroot.busybox.net", "package/buildroot"),
("busybox.busybox.net", "package/busybox"),
("uclibc.uclibc.org", "package/uclibc"),
("aboriginal-landley.net", "mine/aboriginal"),
("gentoo-embedded.gentoo.org", "lfs/gentoo-embedded"),
("lfs-dev.linuxfromscratch.org", "lfs/lfs-dev"),
("lfs-support.linuxfromscratch.org", "lfs/lfs-support"),
("containers.lists.linux-foundation.org", "linux/containers"),
("linux-doc.vger.kernel.org", "linux/linux-doc"),
("linux-embedded.vger.kernel.org", "linux/linux-embedded"),
("staff.texas.lonestarcon3.org", "zzz/lonestarcon"),
("lxc-devel.lists.sourceforge.net", "linux/lxc-devel"),
("lxc-users.lists.sourceforge.net", "linux/lxc-users"),
("neuros.googlegroups.com", "zzz/neuros"),
("qemu-devel.nongnu.org", "package/qemu-devel"),
("dropbear.ucc.asn.au", "package/dropbear"),
("user-mode-linux-devel.lists.sourceforge.net", "zzz/uml"),
("tinycc-devel.nongnu.org", "package/tcc"),
("tsgeeks.list.gerf.org", "zzz/tsgeeks"),
("v9fs-developer.lists.sourceforge.net", "linux/v9fs-developer"),
("v9fs-users.lists.sourceforge.net", "linux/v9fs-users"),
("mercurial.selenic.com", "package/mercurial"),
("owner-pcc-list ludd ltu se", "sender", "package/pcc")
]
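
def write_outbox(headers, body, count):
  output=None
  for i in headers:
    i=i.split(":", 1)
    if len(i)==2:
      low=i[0].lower()
      for j in rules:
        # 2-tuple rules match the List-Id header; 3-tuple rules name
        # the header to match in their second field.
        if len(j)==2:
          if low != "list-id": continue
        elif low != j[1]: continue
        # Substring match on the header value; last field is the folder.
        if i[1].find(j[0])!=-1:
          output="mail/"+j[-1]
          break
    if output: break
  # Anything no rule claimed goes to a catch-all mbox.
  if not output: output="mail/filtered"
  print "write %s to %s" % (count, output)
  open(output, "a").write("%s\n%s" % ("".join(headers), "".join(body)))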

def stub(one, two, three):
  sys.stdout.write("pass %s\r" % three)
  sys.stdout.flush()
  pass

if __name__ == "__main__":
  msgs, msglist=readinbox(open("mail/inbox", "r"), write_outbox)
  #msgs, msglist=readinbox(open("mail/inbox", "r"), stub)