The example is complex enough that I won't debug it for you, but I write a lot of Python scripts that make system calls, and have a few pointers that can make your job easier.

  1. Have a look at the Python subprocess module. This gives you a lot more flexibility in running external processes. I have seen cases in which, for example, I can't get the external call to run using subprocess.run, but it will work with subprocess.Popen, or with os.system. The point is it helps to have a bag of tricks that you can draw upon.

  2. The < and > shell operators respectively redirect input from a file to a command, or output from a command to a file. I think what you want here is the pipe (|) operator, which pipes output from one command to be used as input for the next command.

  3. It looks like what you're trying to do is to take the output from a series of sort commands as input to the join command. Instead of parentheses, I think the shell operator you're looking for is the left quote. In shell, enclosing a command in left quotes returns the output of that command. For example, if you had a text file called acc.txt containing a list of Accession numbers

    RESULT=`cat acc.txt `

would assign to RESULT the output of cat acc.txt, that is the contents of the file would be stored in the environment variable RESULT.

(Note: On North American English keyboards, left quote (`) is on the same key as the tilda (~). This is a different character than the right quote ('), which would be on the same key as the double quote (").)

  1. If I was doing this, I might approach it in several possible ways. One is to take the series of commands and create a bash script that ran them. That way you can debug the script first and make sure it is working without the added complication of trying to get Python to call it. Also, that makes for a much simpler command string in your Python script.

The other way is to run each step as separate Python calls, saving intermediate results in temporary files and reading them back in for the next step. This sound like a lot of extra work, but is a lot easier to understand and debug.

Often a better approach is to write the process in Python. Python has a lot of sort methods, for example, and your join operation would also be easy to implement. That would save all the reading and writing to files.

Whatever you end up doing, break complex problems down into easy ones, and get each one to work before going on to the next. For example, spend the time to just get the first sort to work, and don't worry about the other steps. You'll find that this strategy is FASTER than trying to get a big complex thing to work all at once.

By the way, you get points from creating a command string as a separate step from running the command. Again, splitting up the steps makes the code more readable, and easier to debug.



Source link