
Python 2+3: Reading a CSV file

Here are some minimal, complete examples of how to read CSV files and how to write CSV files with Python.

# -*- coding: utf-8 -*-
import csv

# Define data
data = [(1, "A towel,", 1.0),
        (42, " it says, ", 2.0),
        (1337, "is about the most ", -1),
        (0, "massively useful thing ", 123),
        (-2, "an interstellar hitchhiker can have.", 3)]

# Write CSV file
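# Note: on Python 3, you can open the file with newline='' to avoid
# blank lines on Windows, e.g. open('test.csv', 'w', newline='')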
with open('test.csv', 'w') as fp:
    writer = csv.writer(fp, delimiter=',')
    # writer.writerow(["your", "header", "foo"])  # write header
    writer.writerows(data)

# Read CSV file
with open('test.csv', 'r') as fp:
    reader = csv.reader(fp, delimiter=',', quotechar='"')
    # next(reader, None)  # skip the headers
    data_read = [row for row in reader]

print(data_read)

After that, the contents of data_read are

[['1', 'A towel,', '1.0'],
 ['42', ' it says, ', '2.0'],
 ['1337', 'is about the most ', '-1'],
 ['0', 'massively useful thing ', '123'],
 ['-2', 'an interstellar hitchhiker can have.', '3']]
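
If you prefer dicts over tuples, the same round trip works with csv.DictWriter and csv.DictReader. A minimal sketch (the field names and the file name test_dict.csv are made up for illustration):

import csv

fieldnames = ['id', 'text', 'value']
rows = [{'id': 1, 'text': 'A towel,', 'value': 1.0},
        {'id': 42, 'text': ' it says, ', 'value': 2.0}]

with open('test_dict.csv', 'w', newline='') as fp:
    writer = csv.DictWriter(fp, fieldnames=fieldnames)
    writer.writeheader()   # header row comes from fieldnames
    writer.writerows(rows)

with open('test_dict.csv', newline='') as fp:
    print([dict(row) for row in csv.DictReader(fp)])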

If you want to write Unicode on Python 2, you have to install unicodecsv (on Python 3, the built-in csv module handles Unicode natively). Do not open the file with codecs.open, but simply with open. Write it with

import unicodecsv as csv
# Write CSV file
with open('test.csv', 'w') as fp:
    writer = csv.writer(fp, encoding='utf-8')
    # writer.writerow(["your", "header", "foo"])  # write header
    writer.writerows(data)

  • How can I use io.StringIO() with the csv module? This is interesting if you want to serve a CSV on the fly with Flask, without actually storing the CSV on the server; see the sketch below.
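
A minimal sketch of that idea, assuming all you need is the CSV as an in-memory string (the rows variable and the Flask hand-off are illustrative, not part of the answer above):

import csv
import io

rows = [(1, "A towel,", 1.0), (42, " it says, ", 2.0)]

# Write into an in-memory text buffer instead of a file
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerows(rows)

csv_text = buf.getvalue()  # the complete CSV as one string
print(csv_text)            # e.g. hand this to a Flask Response with mimetype 'text/csv'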

Reading a CSV file with pandas:

import pandas as pd

# Read the CSV into a pandas data frame (df)
#   With a df you can do many things
#   most important: visualize data with Seaborn
df = pd.read_csv('myfile.csv', sep=',')
print(df)

# Or export it in many ways, e.g. a list of tuples
tuples = [tuple(x) for x in df.values]

# or export it as a list of dicts
dicts = df.to_dict('records')  # one dict per row

See the read_csv docs for more information. Please note that pandas automatically infers whether there is a header line, but you can also set it manually, as sketched below.
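
For example, a short sketch of setting it manually (reusing myfile.csv from above):

import pandas as pd

df = pd.read_csv('myfile.csv', header=None)  # the file has no header row
df = pd.read_csv('myfile.csv', header=0)     # the first line is the header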

Reading CSV files is supported by a bunch of other libraries, for example:

dask.dataframe.read_csv
spark.read.csv
1,"A towel,",1.0
42," it says, ",2.0
1337,is about the most ,-1
0,massively useful thing ,123
-2,an interstellar hitchhiker can have.,3
.csv

After reading the CSV file into a list of tuples / dicts or a pandas DataFrame, you are simply working with that kind of data. Nothing is CSV-specific any more.

  • CSV: Super simple format (read & write)

In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python.

Why not use csv itself for reading?

@icedwater This is a possibility. However, I prefer Pandas: (1) it automatically deals with headers, (2) it loads the file directly from the path and does not expect a file pointer, (3) it has better "export" options (like the dict export; yes, you can do that with csv, too, but Pandas is simpler). But feel free to post a solution which does not need Pandas :-)

Thanks, I was wondering because you used csv for writing. I would prefer csv or pandas for both, and csv over pandas because it is more likely to already be there.

@icedwater Ok, I've added a pure csv solution (which is now also consistent in structure with my other answers for the other file formats like YAML and JSON).


How do I read and write CSV files with Python? - Stack Overflow

python csv

Reading in your CSV file quickly

I wrote a parser for this kind of thing yesterday, but it was for a very specific kind of input to the header file, so I will show you how to format your column widths to be able to use it.

You can download the binary from the bin directory if you are on OS X Mavericks (where I compiled it) or compile it by going to src and using clang++ csv_iterator.cpp parse.cpp main.cpp -o flatfileparser.

The flat file parser needs two files: a CSV header file in which every fifth element specifies the variable width (again, this is due to my extremely specific application), which you can generate using:

cols = c(8,10,1,2,1,1,1,3,4,3,2,2,4,4,1,4,1,4,1,1,1,1,3,2,2,1,2,2,13,2,4,1,1,1,1,3,3,3,2,3,3,3,3,3,3,3,2,2,2,2,1,1,1,1,1,6,6,6,2,1,1,2,1,1,1,1,1,2,2,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,7,5,4,10,3,3,2,2,2,3,1,1,1,1,2,2,1,1,2,1,9,5,5,1,1,1,2,2,1,1,1,1,1,1,1,1,2,3,3,3,3,3,3,1,4,1,4,1,1,3,3,3,3,2,2,2,2)
writeLines(sapply(c(-1, cols), function(x) paste0(',,,,', x)), '~/tmp/header.csv')

and copying the resulting ~/tmp/header.csv to the same directory as your flatfileparser. Move the flat file to the same directory as well, and you can run it on your flat file:

./flatfileparser header.csv yourflatfile

which will produce yourflatfile.csv. Add the header you have above manually, using appending redirection (>> in Bash).

Alternatively, use Hadley's experimental fastread package by passing the filename to fastread::read_csv, which yields a data.frame. I don't believe it supports fwf files yet, although support is on the way.

mark-mbp-osx:bin mark$ flatfileparser header.csv COLRECT.TXT
-bash: flatfileparser: command not found
mark-mbp-osx:bin mark$ ls
COLRECT.TXT    flatfileparser    header.csv

Try: chmod +x flatfileparser; ./flatfileparser header.csv COLRECT.TXT

mark-mbp-osx:bin mark$ chmod +x flatfileparserchmod +x flatfileparser; ./flatfileparser header.csv COLRECT.TXT
chmod: flatfileparserchmod: No such file or directory
chmod: +x: No such file or directory
mark-mbp-osx:bin mark$

I think you pasted the string "chmod +x flatfileparser" twice. Try two separate commands: first chmod +x flatfileparser and then ./flatfileparser header.csv COLRECT.TXT

My fault, I pasted it into SO twice. I ended up with 144 columns instead of 143. It seems to work fine, so thanks. I am not sure I could use this regularly or on our Windows server. It would be great if it were easy to access from within R. I am just not a real programmer.

substring - Faster way to read fixed-width files in R - Stack Overflow

r substring data.table apply lapply

Are you really sure that you selected the .csv file and not the .xls file? I can only reproduce the error if I try to read in an .xls file. If I try to read in a .csv file or any other text file, it's impossible to recreate the error you get.

> Data <- read.table("test.csv",header=T,sep=",")
> Data <- read.table("test.xlsx",header=T,sep=",")
Warning message:
In read.table("test.xlsx", header = T, sep = ",") :
  incomplete final line found by readTableHeader on 'test.xlsx'

readTableHead is the C function that gives the warning. It tries to read the first n lines (by default the first 5) to determine the type of the data. The rest of the data is read using scan(). So the problem is the format of the file.

One way of finding out is to set the working directory to the directory where the file is. That way you see the extension of the file you read in. I know Windows doesn't show file extensions by default, so you might believe a file is a csv while it isn't.

The next thing you should do is open the file in Notepad or Wordpad (or another editor) and check that the format is equivalent to my file test.csv:

Test1,Test2,Test3
1,1,1
2,2,2
3,3,3
4,4,
5,5,
,6,

This file will give you the following data frame:

> read.table(testfile,header=T,sep=",")
  Test1 Test2 Test3
1     1     1     1
2     2     2     2
3     3     3     3
4     4     4    NA
5     5     5    NA
6    NA     6    NA

The csv format saved by Excel separates all cells with a comma. Empty cells just don't have a value. read.table() can easily deal with this, and recognizes empty cells just fine.

Assuming this is a Windows 7 environment, if Kate looks at the file either copied to the desktop or inside the folder, the icon for a .csv file has an "a" on it, whereas an .xlsx file has an icon that looks more like a worksheet. This is a quick visual way of determining file type. Much easier to see when saved onto the desktop as the icons are larger. :)

'Incomplete final line' warning when trying to read a .csv file into R...

r

import csv
from collections import defaultdict

columns = defaultdict(list) # each value in each column is appended to a list

with open('file.txt') as f:
    reader = csv.DictReader(f) # read rows into a dictionary format
    for row in reader: # read a row as {column1: value1, column2: value2,...}
        for (k,v) in row.items(): # go over each column name and value 
            columns[k].append(v) # append the value into the appropriate list
                                 # based on column name k

print(columns['name'])
print(columns['phone'])
print(columns['street'])

With a file like

name,phone,street
Bob,0893,32 Silly
James,000,400 McHilly
Smithers,4442,23 Looped St.
>>> 
['Bob', 'James', 'Smithers']
['0893', '000', '4442']
['32 Silly', '400 McHilly', '23 Looped St.']

Or alternatively if you want numerical indexing for the columns:

with open('file.txt') as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row (reader.next() is Python 2 only)
    for row in reader:
        for (i,v) in enumerate(row):
            columns[i].append(v)
print(columns[0])

>>> 
['Bob', 'James', 'Smithers']
delimiter=" "
reader = csv.reader(f,delimiter=" ")
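
Another option, sketched here as an alternative (reusing file.txt from above): transpose the rows into columns with zip(*...), then index the columns by header name:

import csv

with open('file.txt') as f:
    reader = csv.reader(f)
    header = next(reader)              # ['name', 'phone', 'street']
    cols = list(zip(*reader))          # one tuple per column
    print(cols[header.index('name')])  # ('Bob', 'James', 'Smithers')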

python - Read specific columns from a csv file with csv module? - Stac...

python csv

To read only the first row of the csv file, use next() on the reader object.

import csv

with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    row1 = next(reader)  # gets the first line
    # now do something here
    # if the first row is the header, call next() again to get the first data row:
    # row2 = next(reader)

or:

with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        # do something here with `row`
        break
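
If you want the first few rows rather than just one, itertools.islice is a tidy option; a sketch reusing some.csv from above:

import csv
from itertools import islice

with open('some.csv', newline='') as f:
    first_three = list(islice(csv.reader(f), 3))  # the first three parsed rows
print(first_three)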

Thanks for the answer, but the second option is not recommended.

file - How to read one single line of csv data in Python? - Stack Over...

python file csv iterator next

You can pass the column names to read_csv via the names argument:

df = pd.read_csv(CSV, names=['ID','CODE'])

From the pandas documentation on names: a list of column names to use. If the file contains no header row, then you should explicitly pass header=None. Duplicates in this list are not allowed unless mangle_dupe_cols=True, which is the default.
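
A slightly fuller sketch, assuming a headerless two-column file (the file name codes.csv is illustrative):

import pandas as pd

# The file has no header row, so pass header=None and name the columns ourselves
df = pd.read_csv('codes.csv', header=None, names=['ID', 'CODE'])
print(df.head())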

python - pandas add columns when read from a csv file - Stack Overflow

python csv pandas

The smarter_csv gem was specifically created for this use case: to read data from a CSV file and quickly create database entries.

require 'smarter_csv'

options = {}
SmarterCSV.process('input_file.csv', options) do |chunk|
  chunk.each do |data_hash|
    Moulding.create!(data_hash)
  end
end

You can use the option chunk_size to read N csv rows at a time, and then use Resque in the inner loop to generate jobs which will create the new records, rather than creating them right away; this way you can spread the load of generating entries across multiple workers.

As the CSV class is included, I feel it's better to use it instead of adding or installing an additional gem. Granted, you didn't propose that a new gem be added to the application. It's so easy to add a series of individual gems, each for a specific purpose and before you know it your application has excessive dependencies. (I find myself consciously avoiding the addition of any gems. In my shop we need to justify the addition to our teammates.)

@Tass it's also pretty easy to add a series of individual methods, each for a specific purpose and before you know it your application has excessive logic that you have to maintain. If a gem works, is well maintained, and uses little resources or can be quarantined to the relevant environments (i.e. Staging for production tasks) it seems to me always a better option to use the gem. Ruby and Rails are all about writing less code.

I have the following error, do you know why? ActiveModel::UnknownAttributeError: unknown attribute 'siren;nom_ent;adresse;complement_adresse;cp_ville;pays;region;departement;activite;date;nb_salaries;nom;prenom;civilite;adr_mail;libele_acti;categorie;tel' for Transaction

I tried this on a rake task, console returns: rake aborted! NoMethodError: undefined method `close' for nil:NilClass stackoverflow.com/questions/42515043/

Ruby on Rails - Import Data from a CSV file - Stack Overflow

ruby-on-rails csv import

My guess would be that you ran into an encoding issue. Especially on Windows you can run into all sorts of problems with encodings. Try opening the csv file with a text editor that can save files with various encodings (e.g. Notepad++), change the encoding to e.g. UTF-8 (which is the preferred encoding of RStudio, most other editors, and R itself), save the file, and try the import again. Just make sure that you don't lose characters; special characters in particular tend to get lost during encoding changes.

Thanks, looks like the encoding was the problem. I changed it to UTF-8 like you suggested and R reads the files perfectly.

How do I read in a CSV file into R? - Stack Overflow

r csv

It sounds like you need to use a data step to read in your CSV. There are lots of papers out there explaining how to do this, so I won't cover it here. This will allow you to specify the format (numeric/character) for each field. The nice thing here is that you already know what formats they need to be in (from your existing dataset), so you can create this read-in fairly easily.

data have;
informat x date9.;
input x y z $;
datalines;
10JAN2010 1 Base
11JAN2010 4 City
12JAN2010 8 State
;;;;
run;

Now, if you have a CSV of the same format, you can read it in by generating the input code from the above dataset. You can use PROC CONTENTS to do this, or you can generate it by using dictionary.columns, which has the same information as PROC CONTENTS.

proc sql;
  select catx(' ', name, ifc(type='char', '$', ' ')) into :inputlist
   separated by ' '
   from dictionary.columns
   where libname='WORK' and memname='HAVE';
  select catx(' ',name,informat) into :informatlist separated by ' '
   from dictionary.columns
   where libname='WORK' and memname='HAVE'
   and not missing(informat);
quit;

The above are two examples; they may or may not be sufficient for your particular needs.

data want;
infile datalines dlm=',';
informat &informatlist.;
input &inputlist.;
datalines;
13JAN2010,9,REGION
;;;;
run;

(obviously you would use your CSV file instead of datalines; it is just used here as an example).

The point is you can write the data step code using the metadata from your original dataset.

Thanks Joe. The data step using the metadata is what I think I was missing.

import - Updating a local SAS table from a CSV file - Stack Overflow

csv import sas

We always used to use the Jet.OLEDB driver, until we started moving to 64-bit applications. Microsoft has not and will not release a 64-bit Jet driver. Here's a simple solution we came up with that uses File.ReadAllLines and String.Split to read and parse the CSV file and manually load a DataTable. As noted above, it DOES NOT handle the situation where one of the column values contains a comma. We use this mostly for reading custom configuration files; the nice part about using CSV files is that we can edit them in Excel.

string CSVFilePathName = @"C:\test.csv";
string[] Lines = File.ReadAllLines(CSVFilePathName);
string[] Fields;
Fields = Lines[0].Split(new char[] { ',' });
int Cols = Fields.GetLength(0);
DataTable dt = new DataTable();
//1st row must be column names; force lower case to ensure matching later on.
for (int i = 0; i < Cols; i++)
    dt.Columns.Add(Fields[i].ToLower(), typeof(string));
DataRow Row;
for (int i = 1; i < Lines.GetLength(0); i++)
{
    Fields = Lines[i].Split(new char[] { ',' });
    Row = dt.NewRow();
    for (int f = 0; f < Cols; f++)
        Row[f] = Fields[f];
    dt.Rows.Add(Row);
}

c# - How to read a CSV file into a .NET Datatable - Stack Overflow

c# .net csv datatable

Reading a CSV file is very simple and common in Java. You actually don't require any extra third-party library to do this for you. A CSV (comma-separated values) file is just a normal plain-text file that stores data column by column, split by a separator (e.g. a comma ",").

In order to read specific columns from the CSV file, there are several ways. The simplest of all is shown below:

String csvFile = "path/to/file.csv";  // the path is illustrative
String cvsSplitBy = ",";              // use comma as separator
String line;

BufferedReader br = new BufferedReader(new FileReader(csvFile));
while ((line = br.readLine()) != null) {
    String[] cols = line.split(cvsSplitBy);
    System.out.println("Column 4= " + cols[4] + " , Column 5=" + cols[5]);
}
br.close();

If you notice, nothing special is performed here. It is just reading a text file and splitting it by a separator, ",". Given an input file like this:

"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"
"1.0.1.0","1.0.3.255","16777472","16778239","CN","China"
"1.0.4.0","1.0.7.255","16778240","16779263","AU","Australia"
"1.0.8.0","1.0.15.255","16779264","16781311","CN","China"
"1.0.16.0","1.0.31.255","16781312","16785407","JP","Japan"
"1.0.32.0","1.0.63.255","16785408","16793599","CN","China"
"1.0.64.0","1.0.127.255","16793600","16809983","JP","Japan"
"1.0.128.0","1.0.255.255","16809984","16842751","TH","Thailand"
Column 4= "AU" , Column 5="Australia"
Column 4= "CN" , Column 5="China"
Column 4= "AU" , Column 5="Australia"
Column 4= "CN" , Column 5="China"
Column 4= "JP" , Column 5="Japan"
Column 4= "CN" , Column 5="China"
Column 4= "JP" , Column 5="Japan"
Column 4= "TH" , Column 5="Thailand"

You can, in fact, put the columns in a Map and then get the values simply by using the key.

That simple? Your example breaks when values have commas in them. E.g. "1,0,0,0","1.0.0.255","16777216" would not work (but is a valid csv file). This is why using specifically designed APIs can make your life easier: these edge cases have been thought of and (hopefully) tested.

java - Read CSV file column by column - Stack Overflow

java csv file-io multiple-columns