
I have to write a program that checks whether a particular directory contains any files with a specific extension. If it finds any, it reads them one by one and loads their data into a database.

This is the rough algorithm in my mind:

  1. Using an infinite while() loop, continuously keep checking if the directory has any files of that particular extension (e.g. check if the directory has any *.xml files). I can use the PHP glob() function.

  2. If yes, then in a foreach loop, read data from each file and load it into the database.

  3. Once a file's data has been loaded, delete it.
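The three steps above can be sketched as one sequential pass over the directory. This is only a minimal illustration written in shell rather than PHP; the same shape should carry over to PHP's glob() and rename(). The function names process_pending and convert_and_load, the .processing and .failed suffixes, and the example path are all hypothetical. Each file is renamed before any work starts, so a file that is being loaded no longer matches *.xml on the next check.

```shell
#!/bin/sh

# Hypothetical placeholder: a real version would run the XML-to-CSV
# conversion and the LOAD DATA INFILE step, returning non-zero on failure.
convert_and_load() {
    echo "loading $1"
}

# Process every *.xml file currently in the given directory, one at a time.
process_pending() {
    dir=$1
    for f in "$dir"/*.xml; do
        [ -e "$f" ] || continue        # the glob matched nothing
        claimed="$f.processing"
        # Claim the file by renaming it. Within one filesystem the rename
        # is atomic, and the new name no longer matches *.xml.
        mv "$f" "$claimed" || continue
        if convert_and_load "$claimed"; then
            rm -f "$claimed"           # step 3: delete after a successful load
        else
            mv "$claimed" "$f.failed"  # keep the file for inspection
        fi
    done
}

# Example polling loop (left commented out so the sketch only defines functions):
# while true; do process_pending /path/to/incoming; sleep 5; done
```

Because convert_and_load runs in the foreground, the loop does not look at the directory again until the current file is finished.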

My Question:

I will be constantly checking whether there are any .xml files in the directory. This means that the check will often return true (that is, "Yes, there are .xml files in the directory") even for files whose data is currently BEING loaded.

So once a file has been found in the directory, I need a way to tell whether its data is already in the process of being loaded into the database. How do I check that?

The process of data-loading is that I extract useful data from the file into a .csv file and then use LOAD DATA INFILE SQL query to load the data into my MySQL database.

  • Since you need PHP code to determine whether an external process (mysql LOAD...) is reading a file, I think you're going to need to look at semaphores or mutexes. Lockfiles are a kind of semaphore. Commented Nov 12, 2016 at 13:43
  • @MikeSherrill'CatRecall' The file is first converted to .csv and then MySQL LOAD... is executed. I want to check if the file is BEING CONVERTED TO .csv. The file is converted to .csv file through a SHELL command (the commands we execute in Windows cmd or Linux terminal). Commented Nov 12, 2016 at 14:10
  • "I want to check if the file is BEING CONVERTED TO .csv." Why? If only your PHP code calls a shell script or batch file to convert xml to csv, there's no problem with concurrent access. Is your shell script converting xml to csv as a background job? Commented Nov 12, 2016 at 16:46
  • @MikeSherrill'CatRecall' An infinite while loop continuously checks if there is a .xml file in the directory. Now say, at this very moment, it finds a .xml file in the directory, and I start executing the shell command which extracts useful data from this .xml file and inserts that into a .csv file. (After the data is extracted from .xml and dumped into .csv file, the .xml file is deleted). Now since we are in an infinite while loop (which keeps checking if there is a .xml file in the directory), - continued in next comment! Commented Nov 12, 2016 at 19:08
  • "An infinite while loop continuously checks..." No, it doesn't. It runs a series of commands in a never-ending loop. One of those commands should probably be invoking a shell script and waiting for it to finish, rather than invoking a shell script in the background. Commented Nov 12, 2016 at 19:29

1 Answer

One solution is to use inotifywait, as suggested in this answer: https://stackoverflow.com/a/6767891/2032943, to watch for file events and then act on them.

Also, if you want to check whether the file is already being used by some other command, you can use the Linux lsof command to see if any process holds an open handle for the file:
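As a rough sketch of the inotifywait approach, the watcher below reacts to files that have finished being written or have been moved into the directory, instead of polling. The function names watch_incoming and handle_file and the example path are hypothetical, and the handler only prints what it would do. It assumes inotifywait from the inotify-tools package is installed.

```shell
#!/bin/sh

# Hypothetical handler: a real version would convert the file and load it.
handle_file() {
    case "$1" in
        *.xml) echo "would convert and load: $1" ;;
        *)     return 0 ;;    # ignore files with other extensions
    esac
}

# Watch a directory and hand each completed file to handle_file.
#   -m            keep running instead of exiting after the first event
#   -q            suppress the startup messages
#   close_write   a file opened for writing was closed
#   moved_to      a file was moved into the watched directory
watch_incoming() {
    inotifywait -m -q -e close_write -e moved_to --format '%w%f' "$1" |
    while IFS= read -r path; do
        handle_file "$path"
    done
}

# Example (blocks until interrupted):
# watch_incoming /path/to/incoming
```

Since each event is handled in the foreground, one file is finished before the next event is read, so the same file is not picked up twice while it is still being loaded.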

lsof | grep <filename> 

Note that these commands are specific to Linux and will not work on Windows.
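A minimal sketch of wrapping that check in a reusable function follows. The name is_file_in_use is hypothetical. Instead of grepping the full lsof listing, it passes the path to lsof directly, which should exit with a non-zero status when no process has the file open. The result is only a snapshot: another process could open the file right after the check.

```shell
#!/bin/sh

# Succeeds (exit status 0) if some process currently has the file open.
# Assumes lsof is installed; the "--" marks the end of options so that
# file names starting with "-" are not treated as flags.
is_file_in_use() {
    lsof -- "$1" >/dev/null 2>&1
}

# Example:
# if is_file_in_use /path/to/incoming/data.xml; then
#     echo "still in use, skipping for now"
# fi
```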


3 Comments

Firstly, thank you. Secondly, doesn't the Linux way make it OS dependent?
Yes, the Linux way makes it OS dependent, so the solution will not work on Windows. Let me indicate that in the answer.
Checking for a filename won't catch cases where the file is referenced through a hard link or has been deleted. If you want to be extra sure, check the inode number using stat $filename, then lsof | grep $inode_number
