From eikemeier@fillmore-labs.com Thu Jun 19 20:37:45 2003
Return-Path:
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 8A86937B401
    for ; Thu, 19 Jun 2003 20:37:45 -0700 (PDT)
Received: from mx2.fillmore-labs.com (lima.fillmore-labs.com [62.138.193.83])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 3644343F93
    for ; Thu, 19 Jun 2003 20:37:44 -0700 (PDT)
    (envelope-from eikemeier@fillmore-labs.com)
Received: from pd958a8e9.dip.t-dialin.net ([217.88.168.233] helo=fillmore-labs.com ident=c2v23lyudvwyhlku)
    by mx2.fillmore-labs.com with asmtp (TLSv1:AES256-SHA:256)
    (Exim 4.20) id 19TCic-000284-3F
    for FreeBSD-gnats-submit@freebsd.org; Fri, 20 Jun 2003 05:37:42 +0200
Message-Id: <3EF2817D.60703@fillmore-labs.com>
Date: Fri, 20 Jun 2003 05:37:33 +0200
From: Oliver Eikemeier
Reply-To: Oliver Eikemeier
To: FreeBSD-gnats-submit@freebsd.org
Subject: [PATCH] query-pr.cgi doesn't work with urls enclosed in "<>" or containing a "&".

>Number:         53530
>Category:       www
>Synopsis:       [PATCH] query-pr.cgi doesn't work with urls enclosed in "<>" or containing a "&".
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    ceri
>State:          closed
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Jun 19 20:40:11 PDT 2003
>Closed-Date:    Wed Nov 12 12:59:36 PST 2003
>Last-Modified:  Wed Nov 12 12:59:36 PST 2003
>Originator:     Oliver Eikemeier
>Release:        FreeBSD 4.8-STABLE i386
>Organization:
Fillmore Labs - http://www.fillmore-labs.com
>Environment:
System: FreeBSD nuuk.fillmore-labs.com 4.8-STABLE
>Description:
query-pr.cgi does not work with links that are enclosed in "<" and ">"
(which is fairly common) and links that contain an ampersand ("&").
>How-To-Repeat:
See for example PR www/48575 or numerous others, like:

fixline in query-pr.cgi is broken, try the following excerpt:

#!/usr/bin/perl

sub srcref {
    return shift;
}

sub fixline {
    local($line) = shift;

    $line =~ s/&/&amp;/g;
    $line =~ s/</&lt;/g;
    $line =~ s/>/&gt;/g;
    $line =~ s%((https?|ftp)://[^\s"\)\>,;]+)%<A HREF="$1">$1</A>%gi;
    $line =~ s%(\WPR[:s# \t]+)([a-z3486]+\/)?([0-9]+)%$1<A HREF="query-pr.cgi?pr=$3">$2$3</A>%ig;

    return &srcref($line);
}

sub newfixline {
    local(@splitline) = split(/((?:https?|ftp):\/\/[^\s"\(\)<>,;]+)/, shift);

    local($isurl) = 0;
    foreach (@splitline) {
        if ($isurl) {
            local($href) = local($html) = $_;
            $href =~ s/&/%26/g;
            $html =~ s/&/&amp;/g;
            $_ = "<A HREF=\"$href\">$html</A>";
        } else {
            s/&/&amp;/g;
            s/</&lt;/g;
            s/>/&gt;/g;
            s%(\WPR[:s# \t]+)([a-z3486]+\/)?([0-9]+)%$1<A HREF="query-pr.cgi?pr=$3">$2$3</A>%ig;
        }
        $isurl = ! $isurl;
    }

    return &srcref(join('', @splitline));
}

@urls = (
    '<http://www.freebsd.org/>',
    'http://www.freebsd.org/cgi/query-pr-summary.cgi?multitext=query-pr&sort=lastmod'
);

foreach(@urls) {
    print "Original: ", $_, "\n";
    print "Old:      ", fixline ($_), "\n";
    print "New:      ", newfixline ($_), "\n";
    print "\n";
}

Its output:

Original: <http://www.freebsd.org/>
Old:      &lt;<A HREF="http://www.freebsd.org/&gt">http://www.freebsd.org/&gt</A>;
New:      &lt;<A HREF="http://www.freebsd.org/">http://www.freebsd.org/</A>&gt;

Original: http://www.freebsd.org/cgi/query-pr-summary.cgi?multitext=query-pr&sort=lastmod
Old:      <A HREF="http://www.freebsd.org/cgi/query-pr-summary.cgi?multitext=query-pr&amp">http://www.freebsd.org/cgi/query-pr-summary.cgi?multitext=query-pr&amp</A>;sort=lastmod
New:      <A HREF="http://www.freebsd.org/cgi/query-pr-summary.cgi?multitext=query-pr%26sort=lastmod">http://www.freebsd.org/cgi/query-pr-summary.cgi?multitext=query-pr&amp;sort=lastmod</A>

>Fix:
HTML quoting has to be different in HTML text and in links. The old
fixline escapes the whole line first, so the ";" that ends "&gt;" or
"&amp;" cuts the URL match short. The following patch replaces fixline
with code that splits a line into alternating non-URL and URL parts and
quotes each kind separately.

The patch tries to mimic the pre-perl5.005 approach of query-pr.cgi,
which is probably not a good idea. query-pr.cgi should be rewritten,
but I do not have the right testing infrastructure.
So be it:

--- query-pr.cgi.patch begins here ---
--- query-pr.cgi.orig	Mon Jun 9 16:58:00 2003
+++ query-pr.cgi	Fri Jun 20 04:52:47 2003
@@ -219,13 +219,23 @@
 }
 
 sub fixline {
-    local($line) = shift;
-
-    $line =~ s/&/&amp;/g;
-    $line =~ s/</&lt;/g;
-    $line =~ s/>/&gt;/g;
-    $line =~ s%((http|ftp)://[^\s"\)\>,;]+)%<A HREF="$1">$1</A>%gi;
-    $line =~ s%(\WPR[:s# \t]+)([a-z3486]+\/)?([0-9]+)%$1<A HREF="query-pr.cgi?pr=$3">$2$3</A>%ig;
-
-    return &srcref($line);
+    local(@splitline) = split(/((?:https?|ftp):\/\/[^\s"\(\)<>,;]+)/, shift);
+
+    local($isurl) = 0;
+    foreach (@splitline) {
+        if ($isurl) {
+            local($href) = local($html) = $_;
+            $href =~ s/&/%26/g;
+            $html =~ s/&/&amp;/g;
+            $_ = "<A HREF=\"$href\">$html</A>";
+        } else {
+            s/&/&amp;/g;
+            s/</&lt;/g;
+            s/>/&gt;/g;
+            s%(\WPR[:s# \t]+)([a-z3486]+\/)?([0-9]+)%$1<A HREF="query-pr.cgi?pr=$3">$2$3</A>%ig;
+        }
+        $isurl = ! $isurl;
+    }
+
+    return &srcref(join('', @splitline));
 }
--- query-pr.cgi.patch ends here ---

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-www->ceri
Responsible-Changed-By: ceri
Responsible-Changed-When: Fri Jun 20 02:05:02 PDT 2003
Responsible-Changed-Why:
I'm working on something very similar from www/51607.

http://www.freebsd.org/cgi/query-pr.cgi?pr=53530
State-Changed-From-To: open->closed
State-Changed-By: ceri
State-Changed-When: Wed Nov 12 12:59:13 PST 2003
State-Changed-Why:
Committed in r1.36 of www/en/cgi/query-pr.cgi; thanks!

http://www.freebsd.org/cgi/query-pr.cgi?pr=53530
>Unformatted:
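The Fix section notes that mimicking the pre-perl5.005 style is probably not a good idea. As an illustration only, the same split-and-quote approach might look roughly like this with lexical "my" variables instead of "local". The names escape_html and fixline_sketch are hypothetical and not part of query-pr.cgi. PR-number linking and srcref are left out to keep the sketch short, and the "%26" rewrite inside the href simply mirrors the patch rather than recommending it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: quote the characters that are special in HTML text.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;   # must run first, or it would re-escape &lt; and &gt;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical sketch of fixline: because the split pattern has a capture
# group, split returns the URLs as well, so the list alternates between
# non-URL text (even index) and URL (odd index).
sub fixline_sketch {
    my ($line) = @_;
    my @parts = split /((?:https?|ftp):\/\/[^\s"()<>,;]+)/, $line;

    my $is_url = 0;
    for my $part (@parts) {              # $part aliases the array element
        if ($is_url) {
            (my $href = $part) =~ s/&/%26/g;   # same rewrite as the patch
            my $html = escape_html($part);
            $part = qq{<A HREF="$href">$html</A>};
        } else {
            $part = escape_html($part);
        }
        $is_url = !$is_url;
    }
    return join '', @parts;
}

print fixline_sketch('<http://www.freebsd.org/>'), "\n";
# prints: &lt;<A HREF="http://www.freebsd.org/">http://www.freebsd.org/</A>&gt;
```

Splitting before quoting is the point of the design: the URL pattern only ever sees raw text, so the ";" of an entity can no longer end a match early.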