|
Replace text in files on the command line using wildcards and Simple Expressions with Swiss File Knife XE for Windows, Mac OS X and Linux.
sfk xreplace dirName "/searchtext/totext/"
replace in text and binary files using wildcards * and ?
as well as SFK Simple Expressions in brackets [].
Multiple search patterns are executed in the given sequence. Mind this
if they overlap, e.g. /foo/bar/ /foosys/thesys/ makes no sense (foo is
replaced by the first expression, so the 2nd one will fail to match).
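The ordering pitfall can be sketched in a few lines of Python. This is only an analogy built on plain str.replace, not a description of how SFK works internally. The helper name apply_in_sequence is hypothetical, and unlike SFK the sketch is case-sensitive.

```python
def apply_in_sequence(text, patterns):
    """Apply (src, dst) pairs one after another, like a pattern list."""
    for src, dst in patterns:
        text = text.replace(src, dst)
    return text

# /foo/bar/ runs first, so "foosys" has already become "barsys"
# by the time /foosys/thesys/ is tried; the second pattern never hits.
bad = apply_in_sequence("foosys and foo", [("foo", "bar"), ("foosys", "thesys")])

# Listing the longer, more specific pattern first lets both take effect.
good = apply_in_sequence("foosys and foo", [("foosys", "thesys"), ("foo", "bar")])

print(bad)
print(good)
```

Whether reordering is the right fix depends on your data; the point is only that earlier patterns change what later patterns get to see.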
by default, replace functions run in SIMULATION mode,
previewing hits without changing anything. add -yes to apply changes.
Changing binaries may lead to unpredictable results, therefore keep
backups of your files in any case.
subdirectories are included by default
the sfk default for most commands is to process the given directories,
as well as all subdirs within them. specify -nosub to disable this.
options
-nosub do not include files in subdirectories.
-nobin[ary] skip binary files.
-case case-sensitive text comparison. default is insensitive.
for details type: sfk help nocase
-pat starts a list of search or replace patterns of the form
xsrcxdstx where x is the separator char, src the source
to search for, and dst the destination to replace it with.
e.g. /foo/bar/ or _foo_bar_ both replace foo by bar.
-pat is not required if a single filename is given.
-text the same as -pat, starting a text pattern list.
-bylist x.txt read search patterns from a file x.txt, supporting
multiple lines per pattern. (add -full for more.)
-bylinelist x read /from/to/ or just /from/ patterns from a file x
with one pattern per line. (add -full for more.)
-by(line)list does not support sfk variables.
to use variables in patterns create an sfk script
with patterns as parameters. "sfk script" for more.
-usetmp allow creation of temporary files if output data is larger
than the memory limit (default: 300 MB). without -usetmp,
SFK uses the whole RAM, but stops with an error if it runs
out of memory.
-memlimit=n use up to n mbytes of RAM to store output data, and when
the limit is reached, use a temporary file. this option
implies -usetmp. to set this permanently by environment,
type in a batch file: set SFK_CONFIG=memlimit:n
-tmpdir x set directory x as temporary file directory. default is
to use the path specified by TEMP or TMP env variable.
-showtmp tell verbosely which temporary files are created.
SFK temporary filenames contain the process ID
to make sure multiple SFK running in parallel
do not use the same temporary file.
-notmp never create temporary files (default). if combined with
-memlimit, sfk stops with an error if memlimit is reached.
-recsize set input record size for processing (default=100k).
xreplace, xfind and xhexfind extend this automatically
based on the largest search patterns.
-firsthit process only first found pattern match per file.
-quiet do not show progress infos.
-stat show statistics like hits per pattern and no. of files.
-perf show performance statistics.
-full print full help text telling about -bylist pattern files,
special character case sensitivity and nested or repeated
replace behaviour.
output options
-dump create hexdump of search hits or replaced text.
-wide with -dump: show 16 bytes per line.
-lean with -dump: show 8 bytes per line.
-dumpfrom always dump search hits but not replaced text.
-dumpall dump search text and replaced text.
-nodump do not create a hexdump, list only matching files.
-astext no hexdump, but print search hits as plain text.
use this only with plain text files, not binary.
-showle highlight CR/LF line endings in hex dump output
-context=n with hexdump: show additional n bytes of context.
-reldist with hexdump: tell relative distances to previous hits.
-to dir\$file write output files to given path. for details about
output file masks, type "sfk help opt" or "sfk run".
-tofile x write output data to a single output filename x
(which is not interpreted as a mask but taken as is).
-more[n] pause output every 30 or n lines.
-showhits list matching and missing search patterns.
-showjusthit or -showmiss lists only matching or missing patterns.
return codes for batch files
0 = no matches, 1 = matches found, >1 = major error occurred.
see also "sfk help opt" on how to influence error processing.
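When calling sfk from a script rather than a batch file, the return codes listed above could be interpreted as in this Python sketch. The function names are hypothetical, run_xreplace assumes an sfk executable on the PATH, and the mapping simply restates the three cases from the help text.

```python
import subprocess

def describe_sfk_result(returncode):
    """Map an sfk return code to the meaning given in the help text."""
    if returncode == 0:
        return "no matches"
    if returncode == 1:
        return "matches found"
    return "major error"

def run_xreplace(args):
    """Run 'sfk xreplace' with a list of arguments and describe the outcome."""
    completed = subprocess.run(["sfk", "xreplace", *args])
    return describe_sfk_result(completed.returncode)
```

For example, run_xreplace(["in.txt", "/foo*bar/other/"]) would run a simulation (no -yes given) and report whether anything matched.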
temporary files
with option -usetmp or -memlimit sfk may create temporary files
in a folder specified by TEMP or TMP environment variable,
or within /tmp under Linux, or in a folder given by -tmpdir
or from an SFK_CONFIG=tmpdir:... setting. type "sfk help opt"
for further infos.
unexpected repeat replace behaviour
depending on the input data and search/replace expressions,
it can happen that running the same replace multiple times
on the same file produces further hits that didn't exist
in the first run. add option -full to read more on this.
quoted multi line parameters are supported in scripts
using full trim. type "sfk script" for details.
wildcards and SFK expressions
SFK Expressions are simple patterns containing literal text,
wildcards * and ? and character classes in square brackets [].
basically, the syntax provides extended wildcards but no
further logic and is not related to regular expressions.
search patterns are surrounded by a separator character which
can be anything not contained in the search text, like / or _
within a pattern /fromtext/totext/ the fromtext may contain:
* - 0 to 4000 characters in the same
text line or paragraph, i.e. all
bytes not being CR, LF or NULL.
4000 is just a default maximum
that can be changed by:
[0.100000 chars] - 0 to 100000 characters in the same
text line or paragraph, i.e. the
same as * but with a larger range.
? - one character.
????? - same as [5.5 chars] or [5 chars]
[bytes] - 0 to 4000 bytes (with CR,LF,NULL)
i.e. it collects stream text
across lines, even in binary data
** - the same as [bytes].
[0.100 bytes] - 0 to 100 bytes
[.100000 bytes] - up to 100000 bytes
[1.* bytes] - 1 to default maximum bytes
[2 chars] - exactly 2 chars
[30 bytes] - exactly 30 bytes
[byte of aeiou] - one vowel (a OR A OR e OR ...),
case insensitive by default.
"aeiou" is a character list.
[byte of \\\x2f] - a backslash \ or forw. slash /
[bytes of \r\n \t] - whitespace incl. line ends
[bytes of (\r\n \t)] - the same, () are optional
[bytes not \r\n\0] - up to 4000 bytes as long as no
CR, LF or NULL byte appears
[chars] - the same as [bytes not \r\n\0],
i.e. collect text in a line
[char not ( \t)] - same as [byte not ( \r\n\0\t)],
everything not blanks and tabs
[char not )( \t] - not brackets, blanks and tabs,
same as not (\(\) \t)
[chars of a-z0-9] - means a-zA-Z0-9 as search is
case insensitive by default
[chars of \x61-\x7A] - search a-z but not A-Z, or use
option -case for case search
[eol] - end of line by characters:
CRLF or LF or CR
[white] = chars of (\t ) - 0 or more whitespaces
[xwhite] = bytes of (\t \r\n) - same but across lines
[1 white] = byte of (\t ) - 1 whitespace
[digit] = byte of (0-9) - 1 digit
[digits] = bytes of (0-9) - 0 or more digits
[hexdigit] = byte of (0-9a-f) - 1 hexadecimal digit
[hexdigits] = bytes of (0-9a-f) - 0 or more hex digits
special keywords that do not count as tokens:
[skip] - at the start of a pattern: skip such text
completely, do not count it as a search hit.
[keep] - search also the following text but keep it
in the input data, without consuming it.
[ortext] - foo[ortext]bar searches word foo or bar.
[ortext] is allowed only between literals.
anchors that have no length of their own:
[start] - start of file
[end] - end of file
[lstart] - line start, i.e. start or CRLF or CR or LF
[lend] - logical line end, i.e. eol or end of file.
to replace line ends use [eol] instead.
how to search or replace special characters:
- to search or replace text containing the literal characters
* ? \ [ ] then these must be escaped like \* \? \\ \[ \]
- ( ) are escaped only within character lists, like \( \)
- to search or replace the forward slash '/' type \x2f or use
another char around from/to text, e.g. _fromtext_totext_
- parameters with blanks and non trivial characters need double
quotes "", see also "about Shell Command Characters" below.
expansion priorities: (highest first)
if two search parts are side by side, and the same input
character matches both, then these priorities apply:
5: start, end, lstart, lend
4: literal text, eol
3: whitelist classes: byte of, bytes of
2: blacklist classes: chars not, bytes not
1: plain wildcards: ?, *, **, byte, bytes, chars
this means in "/[bytes]foo/" the [bytes] will stop collecting
characters as soon as "foo" is found, as "foo" is a literal.
on same or higher priority the right side stops the left side.
avoid overlapping character groups. for example, [chars][white]
cannot work, as space and tab are part of chars. to fix this
extend chars by relevant exclusions: [chars not ( \t)][white]
the totext may contain:
[part 1] use first text part of the fromtext.
e.g. the fromtext /*foo[.100 chars]bar*/
contains 5 parts: 1 = *, 2 = foo, 3 = [.100 chars], 4 = bar, 5 = *
[part1] the same (blank is optional).
[parts 1,2,3] use parts 1, 2 and 3.
[parts 1-10] use parts 1 to 10.
[strip(part1,\0)] use part 1 but remove zero bytes.
only zero bytes "\0" can be removed.
[file.name] full input filename with path
[file.relname] input filename without path
[file.path] input file's path
[file.base] relname without last .extension
[file.ext] input filename extension
[all] use all parts from fromtext.
[setvar name]...[endvar] set variable "name" with data
between setvar and endvar.
[getvar name] fill in data from variable "name"
although anchors like lstart, lend count as a separate part
they need NOT be specified in the totext. this means that
/[lstart]foo[lend]/bar/ just changes the word "foo".
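To make the part numbering concrete, here is a rough Python analogy for the fromtext /foo*bar/, where part 1 is "foo", part 2 is the text collected by * and part 3 is "bar". SFK expressions are not regular expressions, so the regex below is only an assumed stand-in for this one pattern, and the function names are hypothetical.

```python
import re

# Regex stand-in for /foo*bar/:
# group 1 = "foo", group 2 = "*" (same-line text, at most 4000 chars), group 3 = "bar".
# The non-greedy quantifier mimics "*" stopping as soon as the literal "bar" is found.
FROM = re.compile(r"(foo)([^\r\n\0]{0,4000}?)(bar)", re.IGNORECASE)

def replace_keep_ends(text, middle="goo"):
    """Emulate the totext [part1]goo[part3]."""
    return FROM.sub(lambda m: m.group(1) + middle + m.group(3), text)

def wrap_middle(text):
    """Emulate the totext ===[part2]===."""
    return FROM.sub(lambda m: "===" + m.group(2) + "===", text)

print(replace_keep_ends("a fooXYZbar b"))
print(wrap_middle("a fooXYZbar b"))
```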
if replace loses line endings in output
- when using [eol] in most cases you should add [part...]
to the output pattern, to copy the actual found line
separators, or line endings may get lost.
supported slash patterns
\t = TAB
\r = CR
\n = LF
\x00 = one byte with code 00 hexadecimal
\0 = short form for \x00
\q = a double quote "
\\ = the backslash character \ itself
\[ = the bracket open character [
\] = the bracket close character ]
\* = the literal star character *
\? = the literal question mark ?
\- = to use literal "-" in a command
Within multi line -bylist files:
\ = slash+blank is changed to a single blank
Only within "char of" or "byte not" lists:
\( = to use literal character "("
\) = to use literal character ")"
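The general slash patterns in the table above can be illustrated with a small Python decoder that turns an escaped literal into raw bytes. This is a sketch for understanding the table, not SFK's parser: the function name is hypothetical, wildcards and bracket expressions are not interpreted, and the -bylist and character-list specific escapes are left out.

```python
def decode_slash_patterns(pattern: str) -> bytes:
    """Translate SFK-style slash patterns in a literal into raw bytes."""
    simple = {
        "t": b"\t", "r": b"\r", "n": b"\n", "0": b"\x00", "q": b'"',
        "\\": b"\\", "[": b"[", "]": b"]", "*": b"*", "?": b"?", "-": b"-",
    }
    out = bytearray()
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # \xNN: one byte given as two hexadecimal digits
            if nxt == "x" and i + 3 < len(pattern):
                out.append(int(pattern[i + 2:i + 4], 16))
                i += 4
                continue
            if nxt in simple:
                out += simple[nxt]
                i += 2
                continue
        # anything else is copied as is (assumes single-byte characters)
        out += ch.encode("latin-1")
        i += 1
    return bytes(out)

print(decode_slash_patterns(r"\xFF\xFE\0"))
```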
SFK expression options
-showpart(s) print /from/ part numbers, range statistics
and expansion priority points per part.
done automatically if a required /to/ text
is not given with a command.
-showbest if a /from/ pattern finds nothing, use this to
see how many parts would match so far, and with
up to how many bytes per part. anchors like [lstart]
may show a non zero length when matching (CR)LF.
-showlist with -bylist, show the internal joined list if
commands are spread across multiple lines.
-showall show all of the above.
-xmaxlen=n set default maximum length for chars or bytes commands,
e.g. -xmaxlen=10000 means /foo*bar/ matches with up to
10000 characters between foo and bar. the default max
length without this option is 4000 characters.
performance notes
- always use a string literal, or single byte or char, at the start
of your search expressions, like in /foo*bar/ starting with 'f'.
Do not use a wildcard like * at the start like in /*foobar/
when searching huge input data, as your search will slow down by
factor 256. Use /[lstart]*foobar/ instead.
- the system may cache output file(s), writing to disk in background
after sfk has finished. subsequent batch commands may execute slower.
office file support
sfk ofind search in .xml text file contents of
office files like .docx .xlsx .ods .odt.
sfk help office for more infos and options
see also
sfk xfind search wildcard text in plain text files
sfk ofind search in office files .docx .xlsx .ods
sfk xfindbin search wildcard text in text/binary files
sfk xhexfind search in text/binary with hex dump output
sfk extract extract wildcard data from text/binary files
sfk filter filter and edit text with simple wildcards
sfk find search fixed text in text files
sfk findbin search fixed text in text/binary files
sfk hexfind search fixed text in binary files
sfk replace replace fixed text in text/binary files
sfk view GUI tool to search text as you type
sfk replace replace fixed text with high performance
sfk xreplace replace wildcard text in text/binary files
beware of Shell Command Characters.
to find or replace text patterns containing spaces or special
characters like <>|!&?* you must add quotes "" around parameters
or the shell environment will destroy your command. for example,
pattern /foo bar/other/ must be written like "/foo bar/other/"
within a .bat or .cmd file the percent % must be escaped like %%
even within quotes: sfk echo -spat "percent %% is a percent \x25"
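If sfk is started from a Python script instead of a shell, passing the parameters as a list avoids shell quoting problems altogether. The sketch below only builds the argument list; build_xreplace_argv is a hypothetical helper, and appending -yes at the end of the command is an assumption about placement.

```python
import shlex

def build_xreplace_argv(target, pattern, apply_changes=False):
    """Build an argument list for 'sfk xreplace' without any shell quoting."""
    argv = ["sfk", "xreplace", target, pattern]
    if apply_changes:
        argv.append("-yes")  # without -yes, sfk only simulates
    return argv

argv = build_xreplace_argv("in.txt", "/foo bar/other/")

# Handing argv to subprocess.run(argv) bypasses the shell, so the blank in
# the pattern needs no quotes. shlex.join shows how the same command would
# have to be quoted when typed into a POSIX shell.
print(shlex.join(argv))
```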
about example numbers with [brackets]
if you see [1] type "sfk cmd 1" for whole command in one line.
bad examples with corrections
if input text contains:
bool bClFoo;
bool bClBar ;
sfk xfind in.txt "/bool[xwhite]bCl*[xwhite];/"
does NOT match "bool bClFoo;" because * eats the
whole input line including ";" so no input is left
for "[xwhite];" and the whole expression fails.
sfk xfind in.txt "/bool[xwhite]bCl[* not ;][xwhite];/"
matches both "bool bClFoo;" and "bool bClBar ;".
so whenever a search fails to work, spell out exactly
which characters to collect (or not) at each position.
sfk xex in.txt "/[lstart]foo/[lstart]goo/"
there is no need to write an anchor like [lstart]
within totext as it contains no data. use instead:
sfk xex in.txt "/[lstart]foo/goo/"
sfk xex in.txt "/foo[lend]bar/goo[part2]bar/"
anchors like [lend] must be at start or end of fromtext
and cannot be referenced within totext. use instead:
sfk xex in.txt "/foo[eol]bar/goo[part2]bar/"
working examples
sfk xrep mydir "/foo*bar/"
an incomplete command (missing "totext" part in pattern).
sfk shows an info text telling about part numbers
and runs a search for "foo*bar" in all files of mydir.
nothing is changed so far.
sfk xrep mydir "/foo*bar/[part1]goo[part3]/"
same as above, but now the /fromtext/totext/ is complete.
again sfk runs a search for "foo*bar", but now it displays
the changed output text (totext), with everything between
"foo" and "bar" being changed to "goo". add option
-dumpfrom to display the original found text instead.
sfk sel mydir .txt +xrep "/foo*bar/[part1]goo[part3]/"
similar to above, replace in all .txt files of mydir.
sfk xrep -text "/class* CFoo/[part1][part3]/" -dir mydir -file .hpp
search only .hpp files within mydir, and replace for example
"class IMPORT CFoo" by "class CFoo".
sfk xrep -pat "/[byte not \n][end]/[part1]\n/"
-dir mydir -file .cpp .hpp -dumpall
find all .cpp or .hpp files in mydir whose last line is not
ending with a linefeed, and add the linefeed. to check exactly
what is changed dump both input and output text. [23]
sfk xrep -dir mydir -file .hpp -enddir
-text "/[byte not \n][end]/[part1]\n/" -dumpall
same as above but with dir parameters first. [25]
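What the two trailing-linefeed commands above do to a single file can be expressed as a short Python function, shown here only to clarify the effect of the pattern. The function name is hypothetical. Because [byte not \n] needs one byte to match, an empty file is left alone.

```python
def ensure_trailing_linefeed(data: bytes) -> bytes:
    """Append LF if the last byte exists and is not already LF."""
    if data and not data.endswith(b"\n"):
        return data + b"\n"
    return data

print(ensure_trailing_linefeed(b"int x;"))
```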
sfk xrep io.txt "/[lstart][20 chars]*/[part3]/"
cut first 20 characters in every line of io.txt.
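The effect of this command can be sketched in Python as follows. The regex is an assumed stand-in for [lstart][20 chars], the function name is hypothetical, and the sketch assumes LF or CRLF line ends. Lines shorter than 20 characters do not match and stay unchanged.

```python
import re

# 20 same-line characters directly after a line start
FIRST_20 = re.compile(r"^[^\r\n\0]{20}", re.MULTILINE)

def cut_first_20(text: str) -> str:
    """Remove the first 20 characters of every line that has at least 20."""
    return FIRST_20.sub("", text)

print(cut_first_20("0123456789abcdefghijREST\nshort\n"))
```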
sfk xrep io.txt "/[lstart][9 bytes]1001*/[part2]9009[part4]/"
in fixed position text file data like:
rec. 001:5318 aef3 2751 1001
rec. 002:1001 aef5 275a 1001
rec. 003:ef49 aef7 2763 1001
replace "1001" where it appears in columns 10 to 13,
in this example only the first "1001" in record 2.
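The column logic of this example, written out in Python for the three sample records. This is an illustration of the intended result rather than of SFK's matching; the function name and its parameters are hypothetical. The 9 skipped bytes are the prefix "rec. 00n:", so the value checked sits in columns 10 to 13.

```python
def replace_in_columns(lines, old="1001", new="9009", start=9):
    """Replace 'old' only where it begins right after the first 'start' characters."""
    end = start + len(old)
    return [
        line[:start] + new + line[end:] if line[start:end] == old else line
        for line in lines
    ]

records = [
    "rec. 001:5318 aef3 2751 1001",
    "rec. 002:1001 aef5 275a 1001",
    "rec. 003:ef49 aef7 2763 1001",
]
for line in replace_in_columns(records):
    print(line)
```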
sfk xrep in.dat "/\xFF\xFE[1 byte]\x80\x81/\xFF\xFE\x00\x80\x81/"
replace byte sequences (not ASCII text strings) in binary data.
searches byte groups starting with values 0xFF 0xFE, then any
single byte, then 0x80 0x81, and always replaces the variable byte
with a binary 0x00 value.
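The same byte-level replacement can be sketched in Python for comparison. The bytes regex is an assumed stand-in for the SFK pattern and the function name is hypothetical; re.DOTALL lets the single variable byte also be a linefeed, as [1 byte] would.

```python
import re

# 0xFF 0xFE, then any one byte, then 0x80 0x81
PATTERN = re.compile(rb"\xFF\xFE.\x80\x81", re.DOTALL)

def zero_variable_byte(data: bytes) -> bytes:
    """Set the byte between FF FE and 80 81 to 0x00."""
    return PATTERN.sub(lambda m: b"\xFF\xFE\x00\x80\x81", data)

print(zero_variable_byte(b"AB\xff\xfe\x41\x80\x81CD"))
```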
sfk xreplace in.txt "/foo*bar/other/"
replace phrases starting with "foo" and ending with "bar"
by word "other" in single file in.txt
sfk xreplace -text "/foo*bar/===[part2]===/" -dir mydir -file .txt
replace foo*bar in all .txt files of folder mydir
with a new pattern containing the text between foo and bar
surrounded by "===".
sfk xrep -text "/\x66\x6f\x6f[0.100 bytes]\x62\x61\x72/---/"
-dir mydir -file .dat
replace binary data starting with bytes 0x66, 0x6f, 0x6f,
ending with 0x62, 0x61, 0x72 and up to 100 bytes in between
by "---" within all .dat files of folder mydir. [24]
|

