#Threading and Writing with CSV and DataFrames


timber cipher
#

I have code that runs some simulations and then extracts data from them. Each simulation is independent, so I tried using threads, which work just fine except for the 'writing the data' part. I was looking for a way to write the data to a single file, one row as soon as each simulation ends (as opposed to writing everything after all simulations have finished), so I came up with this.

However, the data file it writes is a mess. I don't care about the order of the rows, but sometimes it writes empty rows or mixes rows together.
Is there a way to write to the same file while using threads?

wind oak
#

you cannot write to a file like this

#

the only way to do this concurrently is to first accumulate the results. One way, if you have n threads, is to make n files, have each thread write to its own file, then merge the files at the end
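A minimal sketch of the one-file-per-thread idea, under some assumptions: `simulate` is a hypothetical stand-in for the real simulation and returns one CSV row as a string, the `run,value` header is made up, `n_runs` is at least 1, and plain `println` is used instead of CSV.jl to keep the example dependency-free.

```julia
# Hypothetical stand-in for one simulation; returns one CSV row as a string.
simulate(i) = string(i, ",", i^2)

function run_with_part_files(n_runs, outpath; nchunks = Threads.nthreads())
    dir = mktempdir()
    # Split the run indices into contiguous chunks, one chunk per task.
    chunks = collect(Iterators.partition(1:n_runs, cld(n_runs, nchunks)))
    parts = [joinpath(dir, "part_$k.csv") for k in eachindex(chunks)]

    # Each task writes only to its own part file, so no locking is needed.
    @sync for (k, chunk) in enumerate(chunks)
        Threads.@spawn begin
            open(parts[k], "w") do io
                for i in chunk
                    println(io, simulate(i))
                end
            end
        end
    end

    # Merge the part files sequentially, after all tasks have finished.
    open(outpath, "w") do out
        println(out, "run,value")   # assumed header
        for p in parts
            write(out, read(p))
        end
    end
    return outpath
end
```

Calling `run_with_part_files(100, "results.csv")` would leave a single merged file; the part files live in a temporary directory that Julia cleans up on exit.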

#

the other way, which I would recommend, is having a global container of strings or whatever your lines are. a good choice would be a Channel: have each thread put! its results into the channel. with Channel specifically the synchronization is done for you, julia makes sure each put! is safe. just make sure the channel is buffered, because put! on an unbuffered channel blocks until something takes the value

#

then at the end of runtime, once all threads have finished, you have your program flush the channel contents into the file, non-concurrently

#
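# buffered channel: with the default unbuffered Channel(), put! would block until a take!
const channel = Channel{Any}(Inf)

Threads.@threads for i in 1:n_runs
    # do simulation
    put!(channel, result)
end

# now flush channel (single-threaded, after all threads have finished)
while isready(channel)
    towrite = take!(channel)
    # write to csv here
end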
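If rows should reach the file as soon as each simulation ends, rather than after all of them, one possible variation is a single dedicated writer task that drains the channel while the simulations are still running. This is only a sketch: `simulate`, the `run,value` header, and the buffer size of 32 are assumptions, and plain `println` stands in for the real CSV writing.

```julia
# Hypothetical stand-in for one simulation; returns one CSV row as a string.
simulate(i) = string(i, ",", i^2)

function run_with_writer(n_runs, outpath)
    rows = Channel{String}(32)   # buffered; producers block only when it is full

    # The only task that ever touches the file.
    writer = Threads.@spawn begin
        open(outpath, "w") do io
            println(io, "run,value")   # assumed header
            # Iterating a channel blocks until a row arrives and
            # stops once the channel is closed and empty.
            for row in rows
                println(io, row)
                flush(io)              # make the row visible on disk right away
            end
        end
    end

    Threads.@threads for i in 1:n_runs
        put!(rows, simulate(i))        # put! on a Channel is thread-safe
    end

    close(rows)    # no more rows: lets the writer loop finish
    wait(writer)   # make sure everything is written before returning
    return outpath
end
```

Because only one task writes, rows cannot interleave, and the simulation threads never wait on file I/O unless the buffer fills up.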
#

you can also lock the file, this way you only have to change two lines in your code lol

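const file_lock = ReentrantLock()

Threads.@threads for i in #...
  # do simulation
  # only one thread at a time gets past the lock; each call must append
  # (append=true), otherwise CSV.write overwrites the file
  @lock file_lock CSV.write(#...
end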
#

but this would penalize performance slightly, the version above is most likely faster
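A self-contained sketch of the lock variant, assuming a hypothetical `simulate` function and a made-up `run,value` header, and writing rows with `println` to one shared file handle instead of calling CSV.jl. The simulation runs outside the lock so that only the short write is serialized.

```julia
# Hypothetical stand-in for one simulation; returns one CSV row as a string.
simulate(i) = string(i, ",", i^2)

function run_with_lock(n_runs, outpath)
    file_lock = ReentrantLock()
    open(outpath, "w") do io
        println(io, "run,value")       # assumed header
        Threads.@threads for i in 1:n_runs
            row = simulate(i)          # expensive part, done without the lock
            lock(file_lock) do         # one thread at a time writes a full row
                println(io, row)
            end
        end
    end
    return outpath
end
```

The performance cost mentioned above comes from threads waiting on `file_lock`; it stays small as long as each simulation takes much longer than writing one row.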

wind oak
#

but it can only be a single worker that does this