grep doesn't handle \w correctly
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
grep (Ubuntu) | New | Undecided | Unassigned |
Bug Description
Put the following text into a file called testfile.txt:
-------
if ($this->_touchOnly) //log if touchOnly
only log the output of the controller if $silent
if (window.console) console.log(data);
ERunActions:
self::logTrace(
self::logError(
-------end of testfile.txt
Try this command:
$ grep -Re "[^\w]log[^\w]" testfile.txt
Expected result:
Should find matches only in lines 1, 2 and 3, highlighting the word "log"
Observed result:
Finds also matches in lines 5,6,7, highlighting the strings "logO", "logT" and "logE" respectively.
According to the man page:
The symbol \w is a synonym for [_[:alnum:]]
And in turn:
For example, [[:alnum:]] means the character class of numbers and
letters in the current locale. In the C locale and ASCII character set
encoding, this is the same as [0-9A-Za-z].
It looks like, instead, \w is interpreted as only numbers and LOWERCASE letters, but not uppercase letters.
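To help narrow this down, here is a small reproduction sketch. It recreates testfile.txt exactly as quoted above, runs the original command, and then runs the same pattern with \w written out as the class the man page says it stands for, [_[:alnum:]]. One explanation worth ruling out before blaming the handling of uppercase letters: in POSIX bracket expressions a backslash is an ordinary character, so [^\w] may simply mean "any character other than a backslash or the letter w". If that is what is happening, the second command should print only the first three lines.

```shell
# Recreate the test file as quoted in this report.
# The quoted 'EOF' keeps the shell from expanding $this and $silent.
cat > testfile.txt <<'EOF'
if ($this->_touchOnly) //log if touchOnly
only log the output of the controller if $silent
if (window.console) console.log(data);
ERunActions:
self::logTrace(
self::logError(
EOF

# Original command from this report.
echo "--- original pattern ---"
grep -Re "[^\w]log[^\w]" testfile.txt

# Same intent, with \w replaced by the class the man page gives for it.
# Inside brackets the named class is written [:alnum:], next to a literal _.
echo "--- explicit character class ---"
grep -e "[^_[:alnum:]]log[^_[:alnum:]]" testfile.txt
```

If the two outputs differ, the difference comes from how \w is read inside the brackets rather than from the locale's definition of [:alnum:].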
I know it seems unbelievable to be finding a bug in grep in 2013, but compare to
http://
which shows the expected result
ProblemType: Bug
DistroRelease: Ubuntu 12.10
Package: grep 2.12-2
ProcVersionSign
Uname: Linux 3.5.0-27-generic i686
NonfreeKernelMo
ApportVersion: 2.6.1-0ubuntu10
Architecture: i386
Date: Sat Apr 6 22:27:39 2013
InstallationDate: Installed on 2010-06-23 (1018 days ago)
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release i386 (20100429)
MarkForUpload: True
ProcEnviron:
TERM=xterm
PATH=(custom, no user)
XDG_RUNTIME_
LANG=en_US.UTF-8
SHELL=/bin/bash
SourcePackage: grep
UpgradeStatus: Upgraded to quantal on 2013-01-13 (83 days ago)