$ go version
go version go1.7 linux/amd64
Reading CSV files is, out of the box, quite slow (tl;dr: 3x slower than a simple Java program, 1.5x slower than the obvious Python code). A typical example:
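    package main

    import (
        "encoding/csv"
        "fmt"
        "io"
        "log"
        "os"
    )

    func main() {
        f, err := os.Open("mock_data.csv")
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()

        r := csv.NewReader(f)
        for {
            line, err := r.Read()
            if err == io.EOF {
                break
            }
            if err != nil {
                // Without this check, line would be nil and line[0] would panic.
                log.Fatal(err)
            }
            if line[0] == "42" {
                fmt.Println(line)
            }
        }
    }

Python3 equivalent: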
    import csv

    with open('mock_data.csv') as f:
        r = csv.reader(f)
        for row in r:
            if row[0] == "42":
                print(row)

Equivalent Java code:
    import java.io.BufferedReader;
    import java.io.FileReader;

    public class ReadCsv {
        public static void main(String[] args) {
            BufferedReader br;
            String line;
            try {
                br = new BufferedReader(new FileReader("mock_data.csv"));
                while ((line = br.readLine()) != null) {
                    // Naive split: no handling of quoted fields.
                    String[] data = line.split(",");
                    if (data[0].equals("42")) {
                        System.out.println(line);
                    }
                }
            } catch (Exception e) {}
        }
    }

Tested on a 50MB CSV file with 1'000'002 lines, generated as:
    data = ",Carl,Gauss,cgauss@unigottingen.de,Male,30.4.17.77\n"
    with open("mock_data.csv", "w") as f:
        f.write("id,first_name,last_name,email,gender,ip_address\n")
        f.write(("1"+data)*int(1e6))
        f.write("42"+data)

Results:
    Go:     avg 1.489 secs
    Python: avg 0.933 secs (1.5x faster)
    Java:   avg 0.493 secs (3.0x faster)

Go's error reporting is obviously better than what you get with that Java code, and I'm not sure about Python, but people have been complaining about encoding/csv slowness, so it's probably worth investigating whether the csv package can be made faster.