Rectangle 27 140

Just used FART ("Find And Replace Text" command line utility): excellent little freeware for text replacement within a large set of files.

fart.exe -p -r -c -- C:\tools\perl-5.8.9\* @@APP_DIR@@ C:\tools

will preview the replacements that would be made, recursively, in the files of this Perl distribution.
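If you want a preview that lists the exact lines that would change, here is a minimal Python sketch of the same idea. It is not FART and does not reproduce its options; the function name `preview_replacements` and the choice to skip files that are not valid UTF-8 are assumptions made for illustration.

```python
import os


def preview_replacements(root, find, replace):
    """Return (path, line_number, old_line, new_line) for each line that would change.

    Nothing is written to disk: this only reports what a replacement would do.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8") as handle:
                    lines = handle.read().splitlines()
            except (UnicodeDecodeError, OSError):
                # Assumption: binary or unreadable files are simply skipped.
                continue
            for number, line in enumerate(lines, start=1):
                if find in line:
                    hits.append((path, number, line, line.replace(find, replace)))
    return hits
```

Calling `preview_replacements(r"C:\tools\perl-5.8.9", "@@APP_DIR@@", r"C:\tools")` would return one tuple per affected line, which you can print before deciding to run the real replacement.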

Only problem: the FART website icon isn't exactly tasteful, refined or elegant ;)

The cool thing is it's one single exe. No dependencies. No fine print. Super easy to deploy.

Very lightweight and easy to use, but I was hoping it would print out the exact places that replacements took place. Not being able to see that gave me a sense of insecurity.

Thanks, it's perfect, should be part of the standard DOS tools and worked a charm. However, the -p option doesn't show you how many changes it 'would' make and always reports 0, which threw me for a few minutes.

I understand this is a very old question, but I found more information and hope it will be helpful to Stack Overflow users. Here is another link where FART is well explained: FART explained @emtunc.org, and another page can be found here: FART. Please be careful when replacing / and ': this does not work for everyone. For me it worked in some cases but not on some files, and I don't know why. I used this to replace text with other text and a /

How can you find and replace text in a file using the Windows command-...

windows command-line scripting batch-file text-files
Rectangle 27 3

Just download fart (find and replace text) from here

fart -r "C:\myfolder\*.*" findSTR replaceSTR

This command searches C:\myfolder and all sub-folders and replaces findSTR with replaceSTR.

Batch script to find and replace a string in text file within a minute...

batch-file replace
Rectangle 27 1

To replace any run of '\r' and '\n' characters with a single '\r\n', use:

$result = preg_replace('/[\r\n]+/', "\r\n", $text);

This will also replace single '\r' or '\n' with '\r\n'.
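For comparison, a minimal Python sketch of the same idea (the function names are illustrative, not from the answer). Note that the character class `[\r\n]+` treats a whole run of line-break characters as one break, so blank lines are collapsed as well. The second function is an assumption about what you might want instead: it converts each line break individually and keeps blank lines.

```python
import re


def collapse_to_crlf(text):
    # Same pattern as the PHP call: any run of CR/LF characters becomes one CRLF.
    return re.sub(r"[\r\n]+", "\r\n", text)


def normalize_each_to_crlf(text):
    # Alternative: convert every individual line break (CRLF, CR, or LF) to CRLF.
    # "\r\n" is listed first so a CRLF pair is treated as one break, not two.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)
```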

regex - Replace any combination of (CR) and (LF) with a single (CRLF) ...

php regex preg-replace str-replace
Rectangle 27 37

In theory, your regular expression does work, but the problem is that not all operating systems and browsers send only \n at the end of a line. Many will also send a \r.

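// Alternative 1: collapse two or more line breaks (LF or CRLF) into one blank line.
$text = preg_replace("/(\r?\n){2,}/", "\n\n", $text);
// Alternative 2: also covers clients that send a bare \r.
$text = preg_replace("/[\r\n]{2,}/", "\n\n", $text);
// Replace multiple (one or more) line breaks with a single one.
$text = preg_replace("/[\r\n]+/", "\n", $text);

// Wrap long lines at 120 characters, then convert remaining newlines to <br /> tags.
$text = wordwrap($text, 120, '<br/>', true);
$text = nl2br($text);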

@Sourav - In your example above, you were replacing 2 or more \n with 2 \n, hence my example. If you only want 1 (i.e. to skip to the next line without leaving an empty line in between), simply replace the \n\n with \n.

php - Replace Multiple Newline, Tab, Space - Stack Overflow

php regex preg-replace
Rectangle 27 3

It is possible that your newlines are represented as \r\n. In order to replace them you should do:

text.replace('\r\n', ' $ ')

For a portable solution that works on both UNIX-like systems (which use \n) and Windows (which uses \r\n), you can substitute the text using a regex:

>>> import re
>>> re.sub('\r?\n', ' $ ', 'a\r\nb\r\nc')
'a $ b $ c'
>>> re.sub('\r?\n', ' $ ', 'a\nb\nc')
'a $ b $ c'

An additional comment: I would recommend doing first text = text.replace('\r\n','\n') and then doing text = text.replace('\n', ' $ '). This works for all files, for the ones having '\n' line separators and for the ones having '\r\n' line separators.
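The two-step replacement described in that comment, written out as a small function (the name `join_lines` and the default separator are only for illustration):

```python
def join_lines(text, separator=" $ "):
    # Step 1: normalize Windows line endings to "\n".
    # Step 2: replace every remaining "\n" with the separator.
    return text.replace("\r\n", "\n").replace("\n", separator)
```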

Check the edit, I think the best solution is with a regex

Replace '\n' in a string in Python 2.7 - Stack Overflow

python string replace newline
Rectangle 27 1

The file you uploaded is not a fixed width file:

I am not a SAS user, but from looking at the SAS code in your post, the column widths in the code do not match up with those in the file.

It appears that there are many carriage return / new lines which do not belong there - in particular they seem to be used in places as a delimiter. There should be one CRLF at the end of each line, and that's it.

Since you say that SAS opens it, I suggest you save it to a CSV format in SAS and then open it in R. Alternatively you could remove the superfluous CRLFs using a good text editor/processor, leaving a single CRLF at the end of each line. Since it appears that each "real" line begins with "DP", you could try replacing -CRLF-DP with (say) -tab-, then deleting all -CRLF-s, then replacing all -tab-s with -CRLF- (this relies on there being no -tab-s in the file already).

But SAS reads it nicely, without any problem. That CRLF does not appear either when I open the file in Notepad. Actually, this has to be done in R. I can't use SAS, so any suggestions on that, please?

@sayak just remove the excess carriage return line feeds as I suggested. Notepad++ can do it. Just choose "Search->Replace", select Extended mode, put "\r\n" as the text to search for (the escape code for the carriage return new line) and "" (i.e. nothing) to replace it with. Then replace all occurrences of "DP" with "\r\nDP". This assumes the only occurrences of DP are at the start of every line. I just did this myself and it works. Takes a few seconds.
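The same two replacements can be scripted. This is a minimal Python sketch under the same assumption as above (the marker "DP" appears only at the start of each real record); the function name is hypothetical, and stripping the leading line break and adding a trailing one are my own tidy-ups, not part of the Notepad++ steps.

```python
def rejoin_records(raw, marker="DP"):
    # Step 1: remove every CRLF, including the ones wrongly used as delimiters.
    flat = raw.replace("\r\n", "")
    # Step 2: put a CRLF back in front of every record marker.
    fixed = flat.replace(marker, "\r\n" + marker)
    # Tidy-up (assumption): no empty first line, one CRLF at the very end.
    return fixed.lstrip("\r\n") + "\r\n"
```

Read the file in binary mode and decode it yourself (or open it with `newline=""`) so that Python does not translate the line endings before this function sees them.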

Hi Robert, it's been a long time. The requirement was not there at that time, but I have been working on this now. I did this in Notepad++ and the file looks good. I read it with read.table; it reads, but not in proper alignment with the column names. Can you please tell me how it worked for you? What code did you use after making the change in Notepad++?

Reading a fixed length text file in R - Stack Overflow

r
Rectangle 27 11

The simple answer is that csv files should always be opened in binary mode whether for input or output, as otherwise on Windows there are problems with the line ending. Specifically on output the csv module will write \r\n (the standard CSV row terminator) and then (in text mode) the runtime will replace the \n by \r\n (the Windows standard line terminator) giving a result of \r\r\n.

Fiddling with the lineterminator is NOT the solution.
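The binary-mode advice applies to Python 2. As a hedged sketch for Python 3, where the csv module works on text files, the documented equivalent is to open the file with `newline=""` so that the runtime does not translate the \r\n the writer emits. The helper name `write_rows` is illustrative.

```python
import csv


def write_rows(path, rows):
    # newline="" disables newline translation, so the "\r\n" row terminator
    # written by csv.writer reaches the file unchanged (no "\r\r\n" on Windows).
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
```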

What is this CSV "standard" of which you speak?

@Dan: I used "standard" as an adjective, not a noun, meaning "usual" or "commonplace". If you want an approximation to a (noun) standard, read tools.ietf.org/html/rfc4180

Point is (as you imply) that there is no standard. That RFC is Informational. While \r\n may be "standard" on Windows, I'm sure Unix applications typically don't see it that way.

@Dan: That is correct -- there is no standard. Scripts should specify the lineterminator [should have been named ROWterminator] that they want (if not the default) and still use binary mode in case the script is run on Windows otherwise the "lineterminator" may be stuffed up.

CSV file written with Python has blank lines between each row - Stack ...

python csv
Rectangle 27 1

If you can use GNU sed, you could put your text in a file and replace the offending lines with the r command. If you cannot, you need to fix your quoting. Or, you could use something like:

{ sed 11q old_file; cat replacement_text; sed 1,14d old_file; } > new_file
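To spell out what that pipeline does: `sed 11q` prints lines 1 to 11, `cat` adds the replacement text, and `sed 1,14d` prints everything from line 15 on, so lines 12 to 14 are swapped out. A minimal Python sketch of the same splice on a list of lines (the function and parameter names are made up for illustration):

```python
def splice_lines(old_lines, replacement_lines, keep_first=11, drop_through=14):
    # Keep lines 1..keep_first, insert the replacement,
    # then continue with everything after line drop_through.
    return old_lines[:keep_first] + replacement_lines + old_lines[drop_through:]
```

With a file, you could read it with `readlines()`, pass the result in, and write the returned list to the new file with `writelines()`.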

php env.php edit from bash script using sed, awk or another tool - Sta...

bash awk sed replace
Rectangle 27 1

CREATE OR REPLACE FUNCTION number_to_base(num BIGINT, base INTEGER)
  RETURNS TEXT
  LANGUAGE sql
  IMMUTABLE
  STRICT
AS $function$
WITH RECURSIVE n(i, n, r) AS (
    SELECT -1, num, 0
  UNION ALL
    SELECT i + 1, n / base, (n % base)::INT
    FROM n
    WHERE n > 0
)
SELECT string_agg(ch, '')
FROM (
  SELECT CASE
           WHEN r=0 then 'z'
           WHEN r=1 then 'q'
           WHEN r=2 then 'w'
           WHEN r=3 then 'k'
           WHEN r=4 then 's'
           WHEN r=5 then 'g'
           WHEN r=6 then 'v'
           WHEN r=7 then '2'
           WHEN r=8 then '7'
           WHEN r=9 then 'l'
           WHEN r=10 then 'b'
           WHEN r=11 then 'p'
           WHEN r=12 then 'n'
           WHEN r=13 then 'h'
           WHEN r=14 then '1'
           WHEN r=15 then '3'
           WHEN r=16 then 'm'
           WHEN r=17 then 'o'
           WHEN r=18 then 'e'
           WHEN r=19 then 'u'
           WHEN r=20 then 'r'
           WHEN r=21 then 'i'
           WHEN r=22 then '4'
           WHEN r=23 then 'j'
           WHEN r=24 then 'y'
           WHEN r=25 then '0'
           WHEN r=26 then 'd'
           WHEN r=27 then 'x'
           WHEN r=28 then 'f'
           WHEN r=29 then '9'
           WHEN r=30 then '5'
           WHEN r=31 then '8'
           WHEN r=32 then '6'
           WHEN r=33 then 't'
           WHEN r=34 then 'c'
           WHEN r=35 then 'a'
           WHEN r=36 then 'C'
           WHEN r=37 then 'E'
           WHEN r=38 then 'Z'          
           WHEN r=39 then 'H'
           WHEN r=40 then 'Y'
           WHEN r=41 then 'I'
           WHEN r=42 then 'W'
           WHEN r=43 then 'Q'
           WHEN r=44 then 'M'
           WHEN r=45 then 'L'
           WHEN r=46 then 'P'
           WHEN r=47 then 'O'
           WHEN r=48 then 'K'
           WHEN r=49 then 'X'
           WHEN r=50 then 'S'
           WHEN r=51 then 'A'
           WHEN r=52 then 'U'
           WHEN r=53 then 'R'
           WHEN r=54 then 'V'
           WHEN r=55 then 'G'
           WHEN r=56 then 'B'
           WHEN r=57 then 'D'
           WHEN r=58 then 'N'
           WHEN r=59 then 'J'
           WHEN r=60 then 'F'
           WHEN r=61 then 'T'
           ELSE '%'
         END ch
  FROM n
  WHERE i >= 0
  ORDER BY i DESC
) ch
$function$;


SELECT
  number_to_base((((gs % 9) + 1)::text || reverse(((gs * 107) + 20000000)::text))::bigint, 62),
  gs
FROM generate_series(1, 1000) gs;

The 107 is used to create 2-character differences. The hard-coded 20 million is there to make sure you have 20 million codes that all have the same number of digits. The other features help make the data appear pseudo-random; however, the procedure is fully reversible from the data, and any code can be regenerated from its ID.

I limited the results to 1000 simply for testing's sake. The output is much longer, but more random-looking, than other solutions. There is probably a more elegant way, but this is fast and repeatable.
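For anyone who wants to check the scheme outside the database, here is a rough Python sketch of the same steps. The alphabet is copied from the CASE expression above; the function name `code_for_id` is mine, and returning an empty string for non-positive input is an assumption (the SQL version returns NULL for 0).

```python
# Digit alphabet in the same order as the CASE expression (r = 0 .. 61).
ALPHABET = "zqwksgv27lbpnh13moeuri4jy0dxf9586tca" + "CEZHYIWQMLPOKXSAURVGBDNJFT"


def number_to_base(num, base=62):
    # Repeated division; remainders come out least significant digit first.
    digits = []
    while num > 0:
        num, remainder = divmod(num, base)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def code_for_id(gs):
    # Prefix digit (gs % 9) + 1, followed by the reversed digits of gs * 107 + 20000000.
    scrambled = str(gs % 9 + 1) + str(gs * 107 + 20000000)[::-1]
    return number_to_base(int(scrambled), 62)
```

Because every step is deterministic, calling `code_for_id` with the same ID always regenerates the same code.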

algorithm - Unique string with at least two differences - Stack Overfl...

algorithm postgresql random
Rectangle 27 1

var text='<div id="main"><div class="replace">&lt; **My Text** &gt;</div><div>Test</div></div>'

var r = /(<(div|DIV)\s+class\s*?=('|")\s*?replace('|")\s*?>)(\s*?&lt;)(.*?)(&gt;\s*?)(<\/(div|DIV)\s*?>)/g;

The whole replacement can be made with:

text.replace(r, function () {
         return 'Hello' + arguments[6] + 'Hello';
    });
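For readers working in Python, roughly the same replacement can be sketched with `re.sub` and a callback. This is a simplified pattern with a single capture group, not a line-by-line port of the regex above; the function name is made up, and `re.IGNORECASE` is used in place of the `(div|DIV)` alternation, which also makes `class` and `replace` case-insensitive.

```python
import re

# Matches <div class="replace">&lt; ... &gt;</div> and captures the text between
# the escaped angle brackets.
REPLACE_DIV = re.compile(
    r"""<div\s+class\s*=\s*['"]replace['"]\s*>\s*&lt;(.*?)&gt;\s*</div\s*>""",
    re.IGNORECASE,
)


def wrap_with_hello(html):
    # The callback receives the match object; group(1) is the captured inner text.
    return REPLACE_DIV.sub(lambda match: "Hello" + match.group(1) + "Hello", html)
```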

Please let me know if there are issues with the solution :).

Btw: I'm totally against regexes like the one in the answer... If you end up with a regex that complex, there's probably a better way to handle the problem.

<div id="main"> Hello **My Text** Hello <div>Test</div></div>
<div id="main"><div class="replace">&lt; **My Text** &gt;</div><div>Test</div></div>
var r = /(<\s*(div|DIV)\s*(id|ID)\s*=\s*('|")main"\s*>)(.*?)((<\s*(div|DIV)\s*>)(.*?)(<\/\s*(div|DIV)\s*>)\s*(<\/\s*(div|DIV)\s*>))/;

Javascript regex to replace text div and < > - Stack Overflow

javascript regex
Rectangle 27 80

library(RCurl)
x <- getURL("https://raw.github.com/aronlindberg/latent_growth_classes/master/LGC_data.csv")
y <- read.csv(text = x)

You have two problems:

  • You're not linking to the "raw" file, but GitHub's display version (visit the URL for https:\raw.github.com....csv to see the difference between the raw version and the display version).
  • https is a problem for R in many cases, so you need to use a package like RCurl to get around it. In some cases (not with Github, though) you can simply replace https with http and things work out, so you can always try that out first, but I find using RCurl reliable and not too much extra typing.
Error in function (type, msg, asError = TRUE)  :    SSL certificate problem: unable to get local issuer certificate
y <- read.csv(text=getURL("https://raw.github.com/aronlindberg/latent_growth_classes/master/LGC_data.csv"))

data manipulation - Read a CSV from github into R - Stack Overflow

r data-manipulation data-management
Rectangle 27 3

M-x dired, then t to mark all files, and Q to query-replace text in all of them. You can expand a subdirectory by using the i command before the query-replace. The key info I'm adding is that if you give a prefix (C-u) to the i command, it will prompt you for arguments, and the -R argument will recursively expand all subdirectories into the dired buffer. So now you can query-replace in every file in an entire directory tree.

editor - Using Emacs to recursively find and replace in text files not...

emacs editor
Rectangle 27 11

This is a pure R solution to the challenge of sampling from a large text file; it has the additional merit of drawing a random sample of exactly n. It is not too inefficient, though lines are parsed to character vectors and this is relatively slow.

We start with a function signature, where we provide a file name, the size of the sample we want to draw, a seed for the random number generator (so that we can reproduce our random sample!), an indication of whether there's a header line, and then a "reader" function that we'll use to parse the sample into the object seen by R, including additional arguments ... that the reader function might need

fsample <-
    function(fname, n, seed, header=FALSE, ..., reader=read.csv)
{

The function seeds the random number generator, opens a connection, and reads in the (optional) header line

set.seed(seed)
    con <- file(fname, open="r")
    hdr <- if (header) {
        readLines(con, 1L)
    } else character()

The next step is to read in a chunk of n lines, initializing a counter of the total number of lines seen

buf <- readLines(con, n)
    n_tot <- length(buf)

Continue to read in chunks of n lines, stopping when there is no further input

repeat {
        txt <- readLines(con, n)
        if ((n_txt <- length(txt)) == 0L)
            break

For each chunk, draw a sample of n_keep lines, with the number of lines proportional to the fraction of total lines in the current chunk. This ensures that lines are sampled uniformly over the file. If there are no lines to keep, move to the next chunk.

n_tot <- n_tot + n_txt
        n_keep <- rbinom(1, n_txt, n_txt / n_tot)
        if (n_keep == 0L)
            next

Choose the lines to keep, and the lines to replace, and update the buffer

keep <- sample(n_txt, n_keep)
        drop <- sample(n, n_keep)
        buf[drop] <- txt[keep]
    }

When data input is done, we parse the result using the reader and return the result

reader(textConnection(c(hdr, buf)), header=header, ...)
}

The solution could be made more efficient, but a bit more complicated, by using readBin and searching for line breaks as suggested by Simon Urbanek on the R-devel mailing list. Here's the full solution

fsample <-
    function(fname, n, seed, header=FALSE, ..., reader = read.csv)
{
    set.seed(seed)
    con <- file(fname, open="r")
    hdr <- if (header) {
        readLines(con, 1L)
    } else character()

    buf <- readLines(con, n)
    n_tot <- length(buf)

    repeat {
        txt <- readLines(con, n)
        if ((n_txt <- length(txt)) == 0L)
            break

        n_tot <- n_tot + n_txt
        n_keep <- rbinom(1, n_txt, n_txt / n_tot)
        if (n_keep == 0L)
            next

        keep <- sample(n_txt, n_keep)
        drop <- sample(n, n_keep)
        buf[drop] <- txt[keep]
    }

    reader(textConnection(c(hdr, buf)), header=header, ...)
}
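For comparison, here is a short Python sketch of classic one-line-at-a-time reservoir sampling (often called Algorithm R). To be clear, this is a different, simpler technique than the chunked binomial update used in the R function above; the function name and the use of a private `random.Random` instance for the seed are my own choices.

```python
import random


def reservoir_sample(lines, n, seed):
    """Return a uniform random sample of up to n items from an iterable of lines."""
    rng = random.Random(seed)  # private generator, so the sample is reproducible
    buf = []
    for index, line in enumerate(lines):
        if index < n:
            # Fill the reservoir with the first n lines.
            buf.append(line)
        else:
            # Line number index + 1 replaces a random slot with probability n / (index + 1).
            slot = rng.randint(0, index)
            if slot < n:
                buf[slot] = line
    return buf
```

An open file object can be passed directly as `lines`; if the file has a header, read it first with `next(handle)` and prepend it to the sample before parsing.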

Thank you for posting your code, and thank you for the excellent documentation. Would you happen to be able to point me toward an example using readBin? Thanks!

memory management - Reading 40 GB csv file into R using bigmemory - St...

r memory-management file-io