Commit Graph

7 Commits

Author SHA1 Message Date
Viktor Szakats
650b33a3db
badwords: pass config as filename arg
Instead of stdin.

To simplify the command-line, and allow using a safe and portable
`system()` call from `badwords-all`.

Ref: https://perldoc.perl.org/functions/system

Closes #20970
2026-03-18 11:22:23 +01:00
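
The idea in this commit, passing the config as a filename argument and invoking the checker through a list-form call instead of a shell pipeline, can be sketched in Python. This is an illustration only, not the Perl code from the commit: the inline child script, `words.txt`, `a.c` and `b.md` are hypothetical stand-ins. Passing an argument list to `subprocess.run` skips the shell, much as the list form of Perl's `system()` does.

```python
import subprocess
import sys


def build_command(config_path, files):
    """Return an argument list: interpreter, inline script, config file, files.

    The inline script is a hypothetical stand-in for the real checker. It
    prints the config filename it received and how many files follow it.
    """
    script = "import sys; print(sys.argv[1], len(sys.argv) - 2)"
    return [sys.executable, "-c", script, config_path, *files]


# No shell is involved, so no quoting or stdin redirection is needed.
result = subprocess.run(
    build_command("words.txt", ["a.c", "b.md"]),
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

Because every argument is a separate list element, filenames containing spaces or shell metacharacters reach the child process unchanged.
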
Daniel Stenberg
6870803187
badwords: only check comments and strings in source code
- when scanning source code, this now only checks source code comments
  and double-quote strings. No more finding bad words as part of code
- this allows the full scan to be done in a single invocation
- detects source code or markdown by file name extension
- moved the whitelist words config into the single `badwords.txt` file,
  no more having them separately (see top of file for syntax)
- all whitelisted words are checked case insensitively now
- removed support for whitelisting words on a specific line number. We
  did not use it and it is too fragile

Excluding the actual code from the scan made the script take an
additional 0.5 seconds on my machine.

Scanning 1525 files now takes a little under 1.7 seconds for me.

Closes #20909
2026-03-13 08:54:35 +01:00
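
A minimal Python sketch of the "comments and strings only" idea, assuming C-style source. It is not the logic used in the Perl script: the `TOKEN` pattern and the `scannable_text` helper are hypothetical, and the pattern ignores edge cases such as a double quote inside a character literal.

```python
import re

# One pattern with three alternatives, tried left to right at each position.
TOKEN = re.compile(
    r'/\*.*?\*/'              # block comment, possibly spanning lines
    r'|//[^\n]*'              # line comment up to the end of the line
    r'|"(?:\\.|[^"\\\n])*"',  # double-quoted string with escapes
    re.DOTALL,
)


def scannable_text(source):
    """Return the comments and string literals found in source, in order."""
    return [match.group(0) for match in TOKEN.finditer(source)]


sample = 'int will = 1; /* will do */\nputs("it will");\n'
parts = scannable_text(sample)
print(parts)
```

In this sample the identifier `will` in the declaration is never returned, so a word check run over `parts` would only see the comment and the string.
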
Viktor Szakats
435eabeac8
badwords: rework exceptions, fix many of them
Also:
- support per-directory and per-upper-directory whitelist entries.
- convert badlist input grep tweak into the above format.
  (except for 'And' which had just a few hits.)
- fix many code exceptions, but do not enforce.
  (there also remain about 350 'will' uses in lib)
- fix badwords in example code, drop exceptions.
- badwords-all: convert to Perl.
  To make it usable from CMake.
- FAQ: reword to not use 'will'. Drop exception.

Closes #20886
2026-03-12 01:01:16 +01:00
Daniel Stenberg
2e52a57107
badwords: combine the whitelisting into a single regex
Also: make the whitelist matches case-insensitive

Takes the script execution time down from 3.6 seconds to 1.1 on my
machine.

Closes #20880
2026-03-11 08:45:54 +01:00
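
Combining many whitelist entries into one case-insensitive regex could look roughly like the following Python sketch. How the Perl script applies its whitelist is not shown in this log, so blanking out matches before the word check is an assumption, and `build_whitelist`, `strip_whitelisted` and the sample phrase are hypothetical.

```python
import re


def build_whitelist(phrases):
    """Compile all phrases into one case-insensitive alternation."""
    if not phrases:
        return None
    return re.compile("|".join(re.escape(p) for p in phrases), re.IGNORECASE)


def strip_whitelisted(line, whitelist):
    """Replace whitelisted phrases with spaces so they cannot trigger hits.

    Padding with spaces keeps column positions of the remaining text stable.
    """
    if whitelist is None:
        return line
    return whitelist.sub(lambda m: " " * len(m.group(0)), line)


whitelist = build_whitelist(["free will"])  # hypothetical entry
cleaned = strip_whitelisted("Free Will is fine", whitelist)
print(repr(cleaned))
```

One compiled pattern means each line is passed through the regex engine once, rather than once per whitelist entry.
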
Viktor Szakats
4021c6e673
badwords: fix showing alternative for case-insensitive hits
Fixing:
```
Use of uninitialized value $alt{"Simply"} in printf at scripts/badwords line 109, <F> line 34.
 maybe use "" instead?
```

Closes #20879
2026-03-10 18:38:29 +01:00
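
The warning above is what happens when a hit such as `Simply` is looked up in a table keyed by a differently cased word. One way to avoid that class of bug, sketched in Python under the assumption that keys can be normalized to lower case (the actual fix in the Perl script may differ, and the `utilize`/`use` pair is hypothetical):

```python
def build_alternatives(pairs):
    """Build a lookup table keyed by the lower-cased word."""
    return {word.lower(): alt for word, alt in pairs}


def suggest(hit, alternatives):
    """Return the suggested alternative for a hit, or None if there is none."""
    return alternatives.get(hit.lower())


table = build_alternatives([("utilize", "use")])  # hypothetical entry
message = suggest("Utilize", table)
print(message)
```

Returning `None` for unknown words lets the caller skip the "maybe use" line instead of printing an empty suggestion.
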
Stefan Eissing
c1cea52f12
badwords: twice as fast
...on my macOS machine, this version uses half the time when
scanning the source.

Closes #20877
2026-03-10 16:07:15 +01:00
Daniel Stenberg
713287188e
badwords: move into ./scripts, speed up
- 'badwords' is now a target in Makefile.am

- change badwords.txt to specify plain "words" instead of regexes so the
  script can build single regexes when scanning, which makes the script
  perform much faster (~6 times faster)

Closes #20869
2026-03-09 22:47:07 +01:00
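
The speed-up described here comes from matching all words with one compiled regex instead of one regex per word. A rough Python sketch of that technique follows. It is not the Perl implementation, and the word list is hypothetical rather than taken from `badwords.txt`.

```python
import re


def build_pattern(words):
    """Combine plain words into a single alternation, longest word first."""
    ordered = sorted(words, key=len, reverse=True)
    alternation = "|".join(re.escape(word) for word in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b")


def scan(lines, pattern):
    """Yield (line_number, word) for every hit, in order of appearance."""
    for number, line in enumerate(lines, start=1):
        for match in pattern.finditer(line):
            yield number, match.group(0)


pattern = build_pattern(["simply", "will", "very"])  # hypothetical words
hits = list(scan(["This will simply work.", "Nothing here."], pattern))
print(hits)
```

Escaping each entry with `re.escape` is what makes plain words safe to join, which is why the word list no longer needs to contain regexes itself.
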