
Dyntools: Dealing with many .OUT files when converting to .csv

asked 2019-10-11 07:23:54 -0500 by TBOE, updated 2019-10-15 01:28:30 -0500

Hi everybody, I'm running into errors when dealing with large numbers of .OUT files. I'm trying to convert many .OUT files to .csv files for further processing in Python, using the code below. I'm working in Spyder.

At a certain point, the following error messages start popping up:

All file units in use. genroe_con01_02.out (RWFIL)

(genroe_con01_02.out is one of the filenames.) The error message pops up for all subsequent files. It seems like the problems start after 64 (2^6) files, which suggests that some kind of register is filling up, though I have also seen 66 files being processed. Removing all variables and restarting the program does not do the trick; the only option is to restart the kernel.

edit: I have also noticed that after removing all variables and restarting, I'm also unable to load other PSSE files, receiving the error messages below. Again, a restart of the kernel is required.

All file units in use. PSSEcasedata.cnv (OpnApplFil/OPNPTI)

All file units in use. PSSEcasedata.snp (OpnApplFil/OPNPTI)

Has anybody come across this problem before?

files = os.listdir(os.curdir)
for filename in files:
    if filename.endswith(".out"):
        output_obj = dyntools.CHNF(filename)
        output_obj.csvout(channels='', csvfile='', outfile='', ntimestep=1)
        del output_obj

Comments

Just curious, would you have the same problem if you use "txtout" instead of "csvout"?

drsgao ( 2019-10-14 03:24:59 -0500 )

Thanks for the suggestion. I have given it a try, but see similar results. The problem arises when reading the .out file. It seems like some process is keeping track of the .out files that have been read, even after the object is deleted.

TBOE ( 2019-10-15 01:11:43 -0500 )

Also, it seems like the problems start after 64 (2^6) files, which suggests that some kind of register fills up, though I have also seen 66 files being processed. (I'll edit this into the question.)

TBOE ( 2019-10-15 01:15:57 -0500 )

Ok, see my answer (run the code outside PSSE). Multiprocessing might help. By the way, I think the OUT file sizes may explain why it is sometimes 64 files and sometimes 66.

drsgao ( 2019-10-15 06:08:59 -0500 )

It might be worthwhile to look into the `pandas` library. I find that it makes managing data with tags (i.e. .out file name, channel name, and channel number) a lot easier, and it has some export to .csv methods built in.

kewasiuk ( 2019-10-16 11:55:56 -0500 )
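Following up on the pandas suggestion, here is a minimal sketch of turning channel data into a DataFrame and exporting it to CSV. The `chanid`/`chandata` dicts below are hypothetical stand-ins for what `dyntools.CHNF.get_data()` returns (I believe it is a `(short_title, chanid, chandata)` tuple, with `chandata` keyed by channel index plus a `'time'` key, but check the API for your version):

```python
import pandas as pd

# Hypothetical stand-ins for the chanid/chandata dicts that
# dyntools.CHNF.get_data() is assumed to return for a two-channel run.
chanid = {1: 'POWR 101', 2: 'VOLT 101'}
chandata = {'time': [0.0, 0.1, 0.2],
            1: [1.00, 1.01, 0.99],
            2: [1.05, 1.04, 1.06]}

# Build one DataFrame: channel labels become columns, time becomes the index.
df = pd.DataFrame({chanid[k]: v for k, v in chandata.items() if k != 'time'},
                  index=chandata['time'])
df.index.name = 'time'

# Built-in CSV export, one call per .out file.
df.to_csv('genroe_con01_02.csv')
```

The appeal here is that once the data is in a DataFrame, filtering channels, joining runs, and renaming columns are one-liners, and `to_csv` replaces the `csvout` call entirely.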

1 answer


answered 2019-10-15 06:09:08 -0500 by drsgao, updated 2019-10-15 06:11:16 -0500

I think this may be a regression problem, since I have tested v33.4 with 128 OUT files (56 MB each) using "txtout" and got no problem. I don't have a higher version of PSSE, so I could not test "csvout" (Siemens says it became available in v33.12.1). I would suggest raising a ticket with Siemens to see what the official line is. Note that I have seen a lot of bugs reported in v34, which is the version in which Siemens started to support Python 3 (I am not sure of the exact version though).

In the meantime, you may be able to work around it using multiprocessing: the memory spaces are separate, so each process gets its own register, and you may be able to keep it in check (I suspect some kind of stack or memory issue).

Again, I don't have "csvout" in my version, so the following function uses "txtout". You would need to change it to "csvout" (you might be able to use the TXT files as CSVs if your channel labels are not too long).

# -*- coding: utf-8 -*-

'''
Testing the "dyntools" with multiprocessing to try to work around the
register fill-up issue.

Change Log
----------------------

* **Notable changes:**
   + Version : 1.0.0
      - First version

'''

import os
import glob
import multiprocessing as mp

# you may need to add more code here to be able to import this module
import dyntools


def out2txt(str_path_outfile):
    '''
    Very simple function using "dyntools" to convert an OUT file to a
    TXT file.

    Note that the "txtout" method can take more arguments; this function
    keeps the arguments to the minimum. Change the arguments to suit
    your needs.

    Parameters
    ------------
    str_path_outfile : str

        The full file path of the OUT file that is to be converted.

    Returns
    -------
    None.
    '''

    obj_chnf = dyntools.CHNF(str_path_outfile)

    obj_chnf.txtout(outfile=str_path_outfile)

    # just a feedback line, you can comment this out
    print(str_path_outfile + ' done!')


if __name__ == '__main__':

    # set your input directory path here, you can also use the "input"
    # function for better interactions
    str_path_dir = r'<your/dir/path>'

    # pattern for glob
    str_pattern = r'*.out'

    list_files = glob.glob(os.path.join(str_path_dir, str_pattern))

    # make a pool with 4 processes
    pool = mp.Pool(4)

    # pool of "out2txt" to deal with the OUT files, note that this will
    # block
    pool.map(out2txt, list_files)

    # some exit code, may be ok without them
    pool.close()

    pool.join()

    print('All Done!')

Tested with v33.4 with 128 OUT files (56MB each). All successful.


Comments

Thanks for this extensive response. I have tested your code on a large batch of files and was able to process 264 files using 4 processes, i.e. 66 files per process. After those, the previously mentioned error message popped up again. So the problem persists, but this is a nice workaround!

TBOE ( 2019-10-16 02:37:35 -0500 )

A few more comments: I'm running PSSE v34.4 from Spyder using Python 2.7. OUT file sizes shouldn't be a problem, as they are only 20 kB each. For now I'm good with the workaround, but I might raise a ticket with Siemens. If I find out more, I'll post it here.

TBOE ( 2019-10-16 02:44:08 -0500 )

You might be able to increase the number of processes to work around it a bit more. If you leave the pool argument blank, i.e. `pool = mp.Pool()`, it uses all available logical processors on your machine.

drsgao ( 2019-10-16 03:00:04 -0500 )


1 follower

Stats

Asked: 2019-10-11 07:23:54 -0500

Seen: 82 times

Last updated: Oct 15