00:00:17 --
00:00:22 class html_elements_parser(SGMLParser):
00:00:22 def reset(self):
00:00:22 self.pieces = []
00:00:22 SGMLParser.reset(self)
00:00:22 def unknown_starttag(self, tag, attrs):
00:00:23 self.pieces.append("<%(tag)s>" % locals())
00:00:25 def unknown_endtag(self, tag):
00:00:27 self.pieces.append("</%(tag)s>" % locals())
00:00:29 --
00:01:31 Hmm...
00:03:09 no, I don't trust it. It doesn't seem to handle comments correctly
00:03:44 that's cos it doesn't handle comments
00:03:57 def handel_comment(self, text): foo
00:03:59 handle
00:04:03 handle_comment(comment)
00:04:14 I mean, that function doesn't handle comments correctly
00:04:25 """For example, the comment "<!-text->" will cause this method to be called with the argument 'text'."""
00:05:49 the XML Recommendation clearly states: [15] Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
00:06:55 well, subclass and modify the methods to your heart's content
00:07:03 use that inheritance power!
00:08:06 what about htmlparser ?
00:08:09 the new one?
00:08:21 I might use that
00:08:38 but I'm comfortable with RegExps
00:08:43 the new one's not based on the sgmlparser, and at a guess is more compliant
00:09:23 yeah
00:09:34 * sbp used it for browser.py
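A minimal sketch of the same idea on top of the newer, non-SGML parser, written for Python 3's `html.parser` rather than the Python 2.2 modules discussed here. The names `ElementCollector` and `collect` are hypothetical, chosen only for illustration; nothing in the log says anyone wrote it this way.

```python
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Hypothetical helper: records tags and comments in document order."""

    def __init__(self):
        super().__init__()
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        # Attributes are ignored, as in the snippet pasted at the top of the log.
        self.pieces.append("<%s>" % tag)

    def handle_endtag(self, tag):
        self.pieces.append("</%s>" % tag)

    def handle_comment(self, data):
        # data is the text between "<!--" and "-->", without the delimiters.
        self.pieces.append("<!--%s-->" % data)


def collect(markup):
    """Parse markup and return the recorded pieces as a list of strings."""
    parser = ElementCollector()
    parser.feed(markup)
    parser.close()
    return parser.pieces
```

Calling `collect("<p><!--text--></p>")` hands `handle_comment` the string `text`, which is the behaviour the manual excerpt describes.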
00:12:22 000552Z I mean, that function doesn't handle comments correctly
00:12:27 why do you say that?
00:12:33 hmm... are you looking for an html parser?
00:13:34 it thinks that anything that matches r'' is a comment. That's not true - it may be invalid
00:14:10 what function are you referring to?
00:16:27 [[[
00:16:43 ]]]
00:16:44 * AaronSw waves
00:16:44 [[[
00:17:41 man, that t3 was fast.
00:18:05 i clicked those links on audiogalaxy and turned around and the songs were done downloading
00:19:14 handle_comment(comment)
00:19:14 This method is called when a comment is encountered. The comment argument is a string containing the text between the "<!-" and "->" delimiters, but not the delimiters themselves. For example, the comment "<!-text->" will cause this method to be called with the argument 'text'. The default method does nothing.
00:19:14 ]]] - Python 2.2 manual, \Doc\lib\module-sgmllib.html
00:19:15 it'd still work if I used it, but I don't want to use a module that thinks that "<!-text->" is a comment
00:19:17 and the RegExps work rather well, IMO
00:20:26 oh, that's just a bug in the TeX source:
00:20:27 themselves. For example, the comment \samp{<!--text-->} will
00:20:27 cause this method to be called with the argument \code{'text'}. The
00:20:50 -- means en dash to TeX
00:21:01 ah...
00:21:11 and tex2html makes it into hyphen-minus
00:21:32 what we'd do without a genius like deltab around
00:21:42 indeed
00:22:01 * tav sticks tongue out at sbp, so my method is still valid, and is elegant ;p
00:22:25 your method is valid but many pages are not
00:22:26 well, what can I say? I'm just a big RegExp-loving fool
00:22:37 yeah, that's the problem
00:22:37 * AaronSw giggles at deltab's comment
00:23:08 for every problem there is a solution that is elegant, valid and unusable due to the mistakes of others.
00:24:03 that's a really intelligent sounding bunch of lies
00:24:17 heh, yeah. they're called jokes
00:24:27 "Hello. My name is [deleted] and I’m the Customer Service Manager
00:24:28 at Crucial Technology. [...] we want to make sure we fully address all of our customers’ needs."
00:24:28 They could start by learning where commas go.
00:25:22 and apostrophes
00:25:29 ehm, where the commas?
00:25:40 yeah, well that's one place where commas go: not where apostrophes belong.
00:26:13 * deltab sees no commas or apostrophes
00:26:21 WTF are you talking about, Aaron?
00:26:23 what are you on about AaronSw?
00:26:25 heh
00:26:30 hmm?
00:26:45 i reckon he's lost it
00:26:49 I see: "[...] I,m the Customer Service Manager [...] al of our customers, needs"
00:26:59 s/al /all /
00:27:10 heh, we saw different
00:27:14 strange
00:27:15 002605Z "Hello. My name is [deleted] and Im the Customer
00:27:15 We see apostrophes in their proper places. Your client sucks
00:27:27 deltab doesn't seem to have
00:27:29 no, Microsoft sucks
00:27:34 * sbp tries the logs
00:27:35 yeah
00:27:36 deltab: i saw "I'm"
00:27:42 must be windows suckiness
00:27:43 * sbp too
00:27:48 tav: you're using a Microsoft OS
00:28:22 Microsoft employees never learned their grammar. See also: Word's Grammar checker, numerous examples on google.
00:28:32 I guess they think ,s and 's are the same.
00:28:38 tav` didn't see most of what AaronSw pasted
00:28:51 AaronSw: yea
00:28:52 * AaronSw apologizes to the Microsoft employees who do know grammar
00:28:58 sbp has quit (Killed (NickServ (Ghost: SeanP!~sean@m171-mp1-cvx3b.pop.ntl.com)))
00:29:01 AaronSw: what you pasted included C1 control characters
00:29:03 try spell checking 'esp worldwide ltd' and then grammar checking it
00:29:09 heh
00:29:15 deltab, aha.
00:29:17 sbp (~sean@m171-mp1-cvx3b.pop.ntl.com) has joined #swhack
00:30:06 AaronSw: Windows renders them as apostrophes, your software as commas
00:30:31 I believe I've seen some Linux software render them as commas too.
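A small sketch of why the same byte can show up differently on different clients. It assumes the pasted text carried byte 0x92, which is a guess based on the "C1 control characters" remark; the actual bytes are not in the log. The helper `to_ascii` is hypothetical.

```python
import unicodedata

# Assumed input: "I'm" with the apostrophe sent as the single byte 0x92.
raw = b"I\x92m"

# Windows-1252 maps 0x92 to U+2019 RIGHT SINGLE QUOTATION MARK.
as_cp1252 = raw.decode("cp1252")

# ISO 8859-1 maps 0x92 to U+0092, a C1 control character with no glyph,
# so each client is free to draw it however it likes (or not at all).
as_latin1 = raw.decode("latin-1")


def to_ascii(text):
    """Replace the curly apostrophe or its C1 stand-in with a plain apostrophe."""
    return text.replace("\u2019", "'").replace("\x92", "'")


print(unicodedata.name(as_cp1252[1]))      # name of the cp1252 reading
print(unicodedata.category(as_latin1[1]))  # general category of the latin-1 reading
print(to_ascii(as_cp1252), to_ascii(as_latin1))
```

Normalising to a plain apostrophe before pasting is one way to sidestep the disagreement. It is roughly the kind of cleanup the demoroniser linked later in the log is meant to do.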
00:30:35 the logs show it correctly: http://blogspace.com/swhack/chatlogs/2002-01-22.txt
00:30:43 or incorrectly
00:31:02 half full/half empty
00:31:13 heh
00:31:23 well sbp and i and the logs see them correctly
00:31:27 obviously AaronSw is wrong
00:31:28 odd... in the logs i see like a backwards `
00:31:33 * sbp ^5's tav
00:31:45 sbp and tav use Windows, so obviously they're wrong too.
00:31:56 I see
00:31:57 I see an apostrophe, but when I paste it into CygWin, it shows a backwards `
00:32:06 ? neat
00:32:14 aah, i18n
00:32:19 heh
00:32:33 i hate how irc descends into this
00:32:49 * tav goes back to his todo
00:32:51 i think this only happens with geeks
00:33:20 true
00:33:26 yes, because everyone else uses Windows
00:33:40 http://www.fourmilab.ch/webtools/demoroniser/
00:33:44 Note how Crucial followed up, and has a site that's a joy to use (most of the time), whereas Dell did not and has a super-sucky site.
00:34:32 speaking of fourmilab, fermilab is quite nearby. good place to go for birthday parties -- kids love atom accelerators
00:34:40 AaronSw: perhaps you could ask them to set their mail client to send only ASCII
00:34:57 At least it wasn't in HTML.
00:35:41 well, thank them for that
00:35:59 people should use text on the Web and HTML in emails. We'd all be... um... so much better off
00:36:14 "Dear [deleted], it appears your mail client, 42B5ACAC.132FCE88.e50eb3888bf3a3330156333d41a337a4, is sending proprietary Microsoft character."
00:36:42 you need an "a" or a plural in there
00:36:54 sending a proprietary Microsoft character/sending proprietary Microsoft characters
00:36:54 What an odd name for a mail client.
00:37:09 Yeah. Why not call it "Fred"?
00:37:20 perhaps they hashed the name or something
00:37:39 no, that's probably the version :-)
00:38:29 AaronSw: where did you get that from?
00:38:43 John: Sue, what version of Microsoft Outlook Express Mail Electronic Sender System are you using?
00:38:44 Sue: Well, I'm using 42B5ACAC.132FCE88.e50eb3888bf3a3330156333d41a337a4, of course.
00:38:44 John: What? You haven't upgraded to 42B5ACAC.132FCE88.e50eb3888bf3a3330166333d41a337a4?!
00:38:45 --
00:38:50 deltab, the X-Mailer header.
00:40:34 sbp has quit (Read error: 104 (Connection reset by peer))
00:40:44 sbp (~sean@m171-mp1-cvx3b.pop.ntl.com) has joined #swhack
00:40:50 deltab, tidy with the -fix-mshtml-2000 option will do something like what the demoroniser appears to
00:42:45 oops, actually it's word-2000: yes
00:43:07 I like how Tidy says: "This looks like HTML proprietary."
00:43:19 :Tidy a :Decrapulator .
00:44:11 I got Bryan Bell to run his designs thru Tidy. I'm quite excited. He wanted to use PNG and CSS, but he says that he's worried about browsers that don't support them.
00:44:22 He also says that Manila won't let him put in ALT tags.
00:44:30 ok, gotta run. dinner
00:46:03 that X-Mailer thing looks like a UUID
00:46:15 the last part
00:46:35 the first parts might be timestamp and IP address
00:48:31 1: "Hey fellas, get on board the Brad fad"
00:48:36 2: "What's that?"
00:48:41 1: "the fad of Brad"
00:48:47 3: "Er... right"
00:49:02 1: "Don't delay!"
00:49:50 water always acts so odd when it's hot
00:53:34 crap! I just remembered that I used to write plays
00:53:46 argh, another suppressed memory surfaces
00:54:43 as long as I didn't walk around the stage clapping my hands and going, "come on people", I think I'm in the clear
01:11:45 * sbp rethinks EARL
01:11:54 what are we using it for?
01:13:21 I feel a bit guilty having been in a group that's been working for so long on a language that no-one actually uses yet
01:15:40 * sbp reads http://www.w3.org/WAI/ER/2001/10/f2f-notes#tools
01:16:10 the WCAG test case is important
01:29:57 lol! @ the plays
01:30:47 I directed a rendition of some famous poetry thing
01:30:56 I forget the name now, but it was famous, so you should know
01:31:36 I feel a bit guilty having been in a group that's been working for so long on a language that no-one actually uses yet
01:31:36 How do you think RDF Core feels?
01:33:22 Hmm, sometimes I catch myself singing songs from "Stan Freberg modestly presents The United States Of America"
01:33:39 well, more like humming
01:38:43 * sbp returns
01:39:07 "Aaron Swartz, director" - sounds pretty good
01:39:14 RDF Core: lol!
01:40:32 perhaps if we made EARL non-RDF, it'd start "working"
01:41:31 * sbp is pretty sure that EARL would work, if used for anything... which is why I'm working on that API again
01:46:06 * AaronSw sets up procmail scripts to filter yahoogroups messages
01:46:40 you and your new toy
01:46:50 hee
01:46:55 it's fun
01:52:11 hmm, i see a recommendation of maildrop
01:54:27 this looks useful: http://bdg.centrin.net.id/~budsan02/mailfilter.htm
01:55:25 aaaaargh:-
01:55:25 [[[
01:55:26 File "earlapi.py", line 73, in test
01:55:26 parser = xml.sax.make_parser()
01:55:26 File "/usr/lib/python2.2/xml/sax/__init__.py", line 93, in make_parser
01:55:26 raise SAXReaderNotAvailable("No parsers found", None)
01:55:28 xml.sax._exceptions.SAXReaderNotAvailable: No parsers found
01:55:30 ]]]
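A hedged sketch of guarding against the error in the traceback above, written for Python 3, where an expat-based SAX driver normally ships with the standard library. `TagCounter` and `count_elements` are hypothetical names, and this is not the code in earlapi.py, which the log does not show.

```python
import xml.sax
from xml.sax import SAXReaderNotAvailable
from xml.sax.handler import ContentHandler


class TagCounter(ContentHandler):
    """Hypothetical handler: counts start tags so there is something to observe."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def count_elements(xml_bytes):
    """Return the number of elements, or None if no SAX driver is installed."""
    handler = TagCounter()
    try:
        # parseString creates its parser through make_parser(), the call
        # that raised SAXReaderNotAvailable in the traceback.
        xml.sax.parseString(xml_bytes, handler)
    except SAXReaderNotAvailable:
        return None
    return handler.count
```

Returning `None` lets the caller report the missing driver cleanly instead of dying with a traceback. Installing an XML package, as suggested in the next lines, is still the real fix on a Python 2.2 setup.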
01:55:37 heh, doom!
01:55:38 gah
01:55:49 Notavailable
01:55:53 fink install python2.2-xml # shoulda used debian
01:56:10 lol @ No tav available
01:56:24 ailable
01:56:27 .wn ailable
01:56:34 .dict ailable
01:56:34 http://work.ucsd.edu:5141/cgi-bin/http_webster?ailable
01:56:35 SAX ails tav
01:56:42 i guess it makes his client go beepy
01:56:48 yea
01:56:53 i got a double beep
01:57:01 * tav adds to his ignore list
01:58:23 argh, damn SAX
01:59:04 tav needs to expand his vocabulary. learn new words like:
01:59:05 all' ottava
01:59:06 atavism
01:59:06 atavisms
01:59:06 atavistic
01:59:06 atavistically
01:59:07 Batavia
01:59:09 Batavian
01:59:11 Batavian Republic
01:59:13 centavo
01:59:15 centavos
01:59:21 Civitavecchia
01:59:22 contraoctave
01:59:22 contraoctaves
01:59:23 cowlstaves
01:59:25 four-line octave
01:59:27 great octave
01:59:29 Gustav
01:59:31 Gustavo
01:59:33 Gustavo A Madero, Villa
01:59:35 Gustavus
01:59:37 hantavirus
01:59:39 pentavalent
01:59:41 Poltava
01:59:43 quarterstaves
01:59:46 rotavirus
01:59:47 rotaviruses
01:59:49 Stavanger
01:59:51 stave
01:59:53 staved
01:59:57 stave off
01:59:59 staves
02:00:01 stavesacre
02:00:03 stavesacres
02:00:05 staving
02:00:07 Stavropol'
02:00:09 Tamatave
02:00:11 tavern
02:00:13 taverna
02:00:15 tavernas
02:00:17 taverner
02:00:19 taverners
02:00:21 taverns
02:00:23 tipstaves
02:00:25