The checkpointing software DMTCP (https://dmtcp.sourceforge.io) is now available on O2.
Note |
---|
DMTCP is NOT guaranteed to work with, or to support, all applications and languages. A given process may fail to run under DMTCP or fail to restart from a saved checkpoint. |
...
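```python
import os

# Placeholder: replace with the number of iterations your job needs.
some_number_here = 10


def main():
    # something here
    for it in range(0, some_number_here):
        print(it)
        # Request a checkpoint before starting the expensive step.
        os.system('dmtcp_command --checkpoint')
        # do something here that takes
        # a very long time
    os.system('dmtcp_command --checkpoint')


if __name__ == '__main__':
    main()
```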
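
Since each checkpoint can be slow, it may help to limit how often one is requested from inside a loop. The following is a minimal sketch, not an official DMTCP or O2 recipe: `request_checkpoint` and `CheckpointThrottle` are hypothetical helper names, and treating exit status 0 of `dmtcp_command --checkpoint` as success is an assumption.

```python
import subprocess
import time


def request_checkpoint(runner=subprocess.run):
    """Run `dmtcp_command --checkpoint`; return True if it exited with status 0.

    `runner` is injectable so the helper can be exercised without DMTCP installed.
    """
    try:
        completed = runner(["dmtcp_command", "--checkpoint"])
    except FileNotFoundError:
        # dmtcp_command is not on PATH (for example, DMTCP is not set up in this shell).
        return False
    return completed.returncode == 0


class CheckpointThrottle:
    """Hypothetical helper: report a checkpoint as due at most once per interval."""

    def __init__(self, min_interval_s, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self.clock = clock
        self.last = clock()  # start counting from construction time

    def due(self):
        """Return True (and reset the timer) if the interval has elapsed."""
        now = self.clock()
        if now - self.last >= self.min_interval_s:
            self.last = now
            return True
        return False


# Example use inside a long-running loop (interval chosen arbitrarily):
#
#   throttle = CheckpointThrottle(min_interval_s=3600)
#   for it in range(some_number_here):
#       ...  # long computation
#       if throttle.due():
#           request_checkpoint()
```

With this pattern the loop body stays unchanged, and the interval can be tuned to balance checkpoint cost against the amount of work lost after a failure.
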
Note |
---|
CAUTION: Creating a checkpoint can be time-consuming and can generate very large files, depending on the RAM (memory) used by the running processes. When a checkpoint is created, DMTCP writes all data currently held in RAM to file, so a job using ~100GB of RAM will produce checkpoint files of a similar size, which could fill up your storage quota. |
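
As a rough guard against filling up storage, a job could compare its own memory footprint with the free space in the checkpoint directory before requesting a checkpoint. This is only a sketch under several assumptions: the function names are hypothetical, `ru_maxrss` is interpreted in KiB as on Linux, only the current process is measured (DMTCP may checkpoint several processes), and `shutil.disk_usage` reports free space on the filesystem, not your remaining quota.

```python
import resource
import shutil


def peak_rss_bytes():
    """Peak resident memory of this process in bytes (assumes Linux, where ru_maxrss is in KiB)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss * 1024


def has_room_for_checkpoint(ckpt_dir, needed_bytes, safety_factor=1.5):
    """Return True if `ckpt_dir` has at least `needed_bytes * safety_factor` bytes free.

    The safety factor is an arbitrary margin chosen for illustration.
    """
    free = shutil.disk_usage(ckpt_dir).free
    return free >= needed_bytes * safety_factor


# Example: skip the checkpoint when the current directory looks too full.
#
#   if has_room_for_checkpoint(".", peak_rss_bytes()):
#       os.system('dmtcp_command --checkpoint')
```
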
...