Lock-free, read-only list in Python?

I've done some basic performance and memory consumption benchmarks, and I was wondering if there is any way to make things even faster.

  1. I have a large list of 70,000 elements. Each element is a tuple containing a numpy ndarray and a file path.

  2. My first version passed a sliced copy of the list to each process created with Python's multiprocessing module, but RAM usage exploded to over 20 gigabytes.

  3. In the second version I moved the list into the global namespace, and each process accesses it by index (such as foo[i]) in a loop. This seems to place it in a shared memory area with copy-on-write (CoW) semantics across the processes, so memory usage no longer explodes (it stays at ~3 gigabytes).

  4. However, according to the performance benchmarks/tracing, the large majority of the application's time is now spent in "acquire".
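For reference, the second version described in step 3 can be sketched roughly as below. This is only a minimal sketch under assumptions: it targets Python 3 with the "fork" start method (POSIX only), and the names `work` and `run`, the tiny arrays, and the `/hypothetical/...` paths are stand-ins, not the real code.

```python
import multiprocessing as mp

import numpy as np

# Hypothetical stand-in for the real 70,000-element list of (ndarray, path) tuples.
# It lives at module level so that forked children inherit it.
sim = [(np.full((2, 3), i, dtype=np.int32), "/hypothetical/path/%d.png" % i)
       for i in range(8)]


def work(i):
    """Read element i of the module-level list and return a small result."""
    arr, path = sim[i]
    return path, int(arr.sum())


def run(n_procs=2):
    """Map `work` over all indices using forked worker processes."""
    # With "fork" (assumed here), children inherit `sim` copy-on-write,
    # so only the integer indices and the small results are pickled.
    ctx = mp.get_context("fork")
    with ctx.Pool(n_procs) as pool:
        return pool.map(work, range(len(sim)))
```

Calling `run()` returns one `(path, sum)` pair per element. The point of the design is that each task carries only an index, while the arrays themselves are never sent between processes.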

So I was wondering if there is any way I can turn this list into some sort of lock-free/read-only structure, so that I can do away with part of the acquire step and speed up access even more.

Edit 1: Here are the top few lines of the profiling output for the app:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   65 2450.903   37.706 2450.903   37.706 {built-in method acquire}
39320    0.481    0.000    0.481    0.000 {method 'read' of 'file' objects}
  600    0.298    0.000    0.298    0.000 {posix.waitpid}
   48    0.271    0.006    0.271    0.006 {posix.fork}
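A table in this format is what Python's cProfile/pstats modules print. The exact invocation used is not shown here, so the following is only a sketch of one way to produce such a listing sorted by `tottime`; `busy` and `top_of_profile` are hypothetical names, and `busy` stands in for the real application entry point.

```python
import cProfile
import io
import pstats


def busy(n=10000):
    # Hypothetical stand-in for the real application entry point.
    return sum(i * i for i in range(n))


def top_of_profile(func, limit=5):
    """Profile func() and return the top `limit` rows sorted by tottime."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("tottime").print_stats(limit)
    return out.getvalue()
```

Note that cProfile only records the process it runs in, so a profile taken this way in the parent does not include time spent inside forked worker processes.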

Edit 2: Here's an example of the list structure:

# Sample code for a rough idea of how the list is constructed
import os
import numpy as np
from PIL import Image  # assuming PIL/Pillow provides Image

sim = []
for root, dirs, files in os.walk(rootdir):
    for filename in files:
        path = os.path.join(root, filename)
        image = Image.open(path)
        np_array = np.asarray(image)
        sim.append((np_array, path))

# Roughly, it would look something like this
sim = [(np.array([[1, 2, 3], [4, 5, 6]], np.int32), "/foobar/com/what.something")]

From then on, the sim list is read-only.
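To make the "read-only from here on" intent explicit, the sketch below marks each array as non-writeable and stores the entries in a tuple. `freeze` is a hypothetical helper, not part of the real code, and this only guards against accidental writes; it is not a claim that it changes the time spent in acquire.

```python
import numpy as np


def freeze(items):
    """Return a tuple of (array, path) pairs whose arrays reject writes."""
    frozen = []
    for arr, path in items:
        arr.setflags(write=False)  # later in-place writes raise ValueError
        frozen.append((arr, path))
    return tuple(frozen)


sim = freeze([
    (np.array([[1, 2, 3], [4, 5, 6]], np.int32), "/foobar/com/what.something"),
])
```

After this, an assignment such as `sim[0][0][0, 0] = 9` raises `ValueError`, and the outer tuple cannot be appended to or reordered.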

asked by Pharaun on 20 January 2011 at 07:26