Ask Your Question
0

Multi-Threading using xlsout

asked 2018-08-15 18:50:32 -0500

haveityourwa gravatar image

updated 2018-08-15 22:26:24 -0500

Hey Guys, Long story short I am trying to extract results into excel quicker and am trying to multi-thread this process. However it seems that even though the multi-thread side of things works within my script only 1 file will properly extract and the rest will just be empty workbooks.

 Process Process-54:
Traceback (most recent call last):
  File "C:\Python27\lib\multiprocessing\process.py", line 267, in _bootstrap
    self.run()
  File "C:\Python27\lib\multiprocessing\process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Dropbox\Power System Services\Scripts\New folder\PYTHON\2.7\PSSE Python Scripts\PSSE\PSSE_MAT\DynDataExtractorMATMulti.py", line 179, in run_data_extract
    outfile=outfile1, sheet='', overwritesheet=True);
  File ".\dyntools.py", line 1084, in xlsout
  File ".\dyntools.py", line 428, in _xls_export
  File ".\excelpy.py", line 168, in <module>
  File "C:\Python27\lib\site-packages\win32com\client\__init__.py", line 11, in <module>
    import gencache
  File "C:\Python27\lib\site-packages\win32com\client\gencache.py", line 662, in <module>
    __init__()
  File "C:\Python27\lib\site-packages\win32com\client\gencache.py", line 56, in __init__
    _LoadDicts()
  File "C:\Python27\lib\site-packages\win32com\client\gencache.py", line 109, in _LoadDicts
    version = p.load()
EOFError

This is the error code that comes out. This is essentially the only line of the code which is causing these errors.

dyntools.CHNF(outfile1).xlsout(channels=signals, show=False, xlsfile=(PathDir + "\\sheets\\" + filename + str(fileNumber) + '.xlsx'),
                 outfile='', sheet='', overwritesheet=True);

Note this code works as intended if I reduce the thread count to 1 however that sort of defeats the point. Cheers

EDIT Thanks for the response EBahr. - I implemented your method of multi-threading the process and still recieving an error once I increase from 1 thread to any other number.

Traceback (most recent call last):
  File "C:/Dropbox/Power System Services/Scripts/New folder/PYTHON/2.7/PSSE Python Scripts/PSSE/PSSE_MAT/ExtractionPSSE.py", line 137, in <module>
    main()
  File "C:/Dropbox/Power System Services/Scripts/New folder/PYTHON/2.7/PSSE Python Scripts/PSSE/PSSE_MAT/ExtractionPSSE.py", line 115, in main
    results = out_pool.map(build_xls, processes)
  File "C:\Python27\lib\multiprocessing\pool.py", line 253, in map
    return self.map_async(func, iterable, chunksize).get()
  File "C:\Python27\lib\multiprocessing\pool.py", line 572, in get
    raise self._value
pywintypes.com_error: (-2147418111, 'Call was rejected by callee.', None, None)

Below is my implementation it simply requires a set of Out files named like so. (Please excuse my lack of structure :/) Results_0.out

Results_1.out

Results_2.out

Results_3.out

import sys
import os
import shutil
import Tkinter, tkFileDialog
import ntpath
import csv
import time
import multiprocessing
from multiprocessing import Pool as ThreadPool
import math
PSSE33BINPATH = r"C:/Program Files (x86)/PTI/PSSEUniversity33/pssbin;C:/Program Files (x86)/PTI/PSSEUniversity33/psslib;"

def main():

    # Prompt user for input

    root = Tkinter.Tk()
    root.withdraw()

    resultType = "Teebar"

    dirname = tkFileDialog.askopenfilename(parent=root, initialdir="/",
                                        title='Please select a _0.out File')
    PathDir = os.path.dirname(dirname)
    PathDir = os.path.abspath(os.path.join(PathDir, os.pardir))

    if not ...
(more)
edit retag flag offensive close merge delete

Comments

Your code works as intended for me. I wonder if you are running into a memory limitation or have an issue with you .out files?

EBahr gravatar imageEBahr ( 2018-08-16 09:58:38 -0500 )edit

Interesting, did you switch back to psse v33 ? Also how would I tell if I am hitting a memory limitation, I think my .out files are fine otherwise it would not work individually

haveityourwa gravatar imagehaveityourwa ( 2018-08-16 15:52:23 -0500 )edit

I switched to PSSE V34 and was able to get it to work with 2 threads, however it failed at more than this.

haveityourwa gravatar imagehaveityourwa ( 2018-08-16 18:21:45 -0500 )edit

Hi, I am trying to extract results into excel, the code That I use is datos_excel = dyntools.CHNF(archivo_out + '.out') #datos_excel.xlsout(channels='', show=False) But theese error is displayed   Server busy, this action can not be completed because the other program is busy, Choos

Juan gravatar imageJuan ( 2018-08-18 17:14:01 -0500 )edit

Hey Juan, I just posted an answer to your question. If you have some example code if you could add it to that question I would be happy to have a look at it.

haveityourwa gravatar imagehaveityourwa ( 2018-08-20 22:34:10 -0500 )edit

2 answers

Sort by » oldest newest most voted
1

answered 2018-08-17 11:41:41 -0500

jconto gravatar image

The code by Ebahr worked for multiple files with outfile=outfile instead of outfile='' in xlsout().

A version based on apply_async (below) allows for specifying a pool delay to minimize data collusion, noticed during saving data to disk:

import os, sys
import multiprocessing as mp
import math
import time

# The following may be different depending on PSSE version
PSSE33BINPATH = r"""C:\\Program Files (x86)\PTI\\PSSE33\\PSSBIN\\"""
sys.path.insert(0, PSSE33BINPATH)
import dyntools

resulst = []

def build_xls((outfile, signals)):
    if signals:
           print outfile, signals
    else:
         print outfile, '[ALL]'
    dyntools.CHNF(outfile).xlsout(channels=signals, show=False, outfile=outfile)
    return True

def call_back(result):
    # This is called whenever build_xls returns a result.
    # resulst is modified only by the main process, not by the pool workers.
    resulst.append(result)

def main():
    pooldelay  = 3      #delay between jobs, sec
    out_files  = ['savnw_cct2.out','savnw_cct3.out','savnw_cct4.out', 'savnw_cct5.out', 'savnw_cct6.out']
    signals    = [1,2,4,5,7,8] # some channels
    #signals    = []                        # to export ALL channels
    # Change number_of_threads to however many threads you want to use, this is just an example of using all threads in CPU
    number_of_threads = int(math.ceil(mp.cpu_count()))
    # Here I set number_of_threads to equal processes if I don't need to use all available threads
    if len(out_files) < number_of_threads:
       number_of_threads = len(out_files)

    # Loop through out files to run in parallel
    pool = mp.Pool(processes=number_of_threads)
    for out_file in out_files:
        arguments = (out_file, signals)
        pool.apply_async(build_xls, args = (arguments,), callback = call_back)
        time.sleep(pooldelay)
    pool.close()
    pool.join()  # Wait until all threads are finished

if __name__=='__main__':
    main()
edit flag offensive delete link more

Comments

Thanks I will have a look at this Monday, updating to 64 bit excel seemed to make a difference for me, if I was not utilising my computer I could run 8 threads with my above code.

haveityourwa gravatar imagehaveityourwa ( 2018-08-17 15:46:41 -0500 )edit

Hey The delay seems to add the stability and stop the errors from occuring thanks for helping with this, also the code from Ebahr and yourself has helped increase my understanding of the multiprocessing library.

haveityourwa gravatar imagehaveityourwa ( 2018-08-20 22:33:01 -0500 )edit
1

answered 2018-08-15 20:28:49 -0500

EBahr gravatar image

The following should get you close:

import sys
import os
import multiprocessing
from multiprocessing import Pool as ThreadPool
import math
PSSE33BINPATH = r"C:\Program Files (x86)\PTI\PSSE33\PSSBIN;C:\Program Files (x86)\PTI\PSSE33\PSSLIB;"

def main():
    processes = [] # setup blank list for processes
    out_files = ['out1.out', 'out2.out', 'out3.out'] # set this to however you are grabbing your out files
    file_number = 0 # not sure how you are giving file_number, so this is arbitrary
    signals = ['Time'] # not sure what signals you are grabbing, so this is just an example
    path_dir = r"C:\temp" # example
    file_name = "channels" # example
    # Change number_of_threads to how ever many threads you want to use, this is just an example of using all threads in CPU
    number_of_threads = int(math.ceil(multiprocessing.cpu_count()))

    # Loop through out files to setup an argument list
    for out_file in out_files:
        file_number += 1 # arbitrary
        arguments = (out_file, signals, path_dir, file_name, file_number)
        processes.append(arguments)

    # Here I set number_of_threads to equal processes if I don't need to use all available threads
    if len(processes) < number_of_threads:
        number_of_threads = len(processes)

    # Build pool and map your processes to it
    out_pool = ThreadPool(number_of_threads)
    results = out_pool.map(build_xls, processes)
    out_pool.close()
    out_pool.join()  # Wait until all threads are finished


def build_xls((outfile, signals, path_dir, file_name, file_number)):
    # The following may be different depending on PSSE version
    os.environ['PATH'] = PSSE33BINPATH + ';' + os.environ['PATH']
    sys.path.insert(0, PSSE33BINPATH)

    import dyntools

    excel_file = path_dir + "\\sheets\\" + file_name + str(file_number) + '.xlsx'
    dyntools.CHNF(outfile).xlsout(channels=signals, show=False, xlsfile=excel_file, outfile='', sheet='', overwritesheet=True)

if __name__=='__main__':
    main()
edit flag offensive delete link more

Comments

Hey Thank you so much for the response, you really did much more than i was expecting with only one line of code to go off of. However I implemented my code around what you have provided and again it only seems to work for 1 thread at a time. Is it possible that I may need certain settings in Excel?

haveityourwa gravatar imagehaveityourwa ( 2018-08-15 21:07:04 -0500 )edit

Your Answer

Please start posting anonymously - your entry will be published after you log in or create a new account.

Add Answer

[hide preview]

Question Tools

Stats

Asked: 2018-08-15 18:50:32 -0500

Seen: 1,083 times

Last updated: Aug 17 '18