1

I have this problem that I want to automate a script. And in passed projects I've used python scheduler for this. But for this project I'm unsure how to handle this.

The problem is that the code works with login details that are outside the code and entered in the commandline when launching the script.

ex. python scriptname.py [email protected] password

How can I automate this with python scheduler? The code that is in 'scriptname.py' is:

//LinkedBot.py import argparse, os, time import urlparse, random from selenium import webdriver from selenium.webdriver.common.keys import Keys from bs4 import BeautifulSoup def getPeopleLinks(page): links = [] for link in page.find_all('a'): url = link.get('href') if url: if 'profile/view?id=' in url: links.append(url) return links def getJobLinks(page): links = [] for link in page.find_all('a'): url = link.get('href') if url: if '/jobs' in url: links.append(url) return links def getID(url): pUrl = urlparse.urlparse(url) return urlparse.parse_qs(pUrl.query)['id'][0] def ViewBot(browser): visited = {} pList = [] count = 0 while True: #sleep to make sure everything loads, add random to make us look human. time.sleep(random.uniform(3.5,6.9)) page = BeautifulSoup(browser.page_source) people = getPeopleLinks(page) if people: for person in people: ID = getID(person) if ID not in visited: pList.append(person) visited[ID] = 1 if pList: #if there is people to look at look at them person = pList.pop() browser.get(person) count += 1 else: #otherwise find people via the job pages jobs = getJobLinks(page) if jobs: job = random.choice(jobs) root = 'http://www.linkedin.com' roots = 'https://www.linkedin.com' if root not in job or roots not in job: job = 'https://www.linkedin.com'+job browser.get(job) else: print "I'm Lost Exiting" break #Output (Make option for this) print "[+] "+browser.title+" Visited! \n("\ +str(count)+"/"+str(len(pList))+") Visited/Queue)" def Main(): parser = argparse.ArgumentParser() parser.add_argument("email", help="linkedin email") parser.add_argument("password", help="linkedin password") args = parser.parse_args() browser = webdriver.Firefox() browser.get("https://linkedin.com/uas/login") emailElement = browser.find_element_by_id("session_key-login") emailElement.send_keys(args.email) passElement = browser.find_element_by_id("session_password-login") passElement.send_keys(args.password) passElement.submit() 

Running this on OSX.

1
  • Is using cron an option? It's the traditional tool for the job Commented May 8, 2018 at 11:51

3 Answers 3

1

I can see at least two different way of automating the trigger of your script. Since you are mentioning that your script is started this way:

python scriptname.py [email protected] password 

It means that you start it from a shell. As you want to have it scheduled, it sounds like a Crontab is a perfect answer. (see https://kvz.io/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/ for example)

If you really want to use python scheduler, you can use the subprocess.

In your file using python scheduler:

import subprocess subprocess.call("python scriptname.py [email protected] password", shell=True) 

What is the best way to call a Python script from another Python script?

Sign up to request clarification or add additional context in comments.

Comments

0

About the code itself

LinkedIn REST Api

Have you tried using LinkedIn's REST Api instead of retrieving heavy pages, filling in some form and sending it back?

Your code is prone to be broken whenever LinkedIn changes some elements in their page. Whereas the Api is a contract between LinkedIn and the users.

Check here https://developer.linkedin.com/docs/rest-api and there https://developer.linkedin.com/docs/guide/v2/concepts/methods

Credentials

So that you don't have to pass your credentials through command line (especially your password, which will be readable in clear through history), you should either

  • use a config file (with your Api Key) and read it with ConfigParser (or anything else, depending on the format of your config file (json, python, etc...)
  • or set them into your environment variables.

For the scheduling

Using Cron

Moreover, for the scheduling part, you can use cron.

Using Celery

If you're looking for a 100% Python solution, you can use the excellent Celery project. Check its periodic tasks.

Comments

0

You can pass the args to the python scheduler.

scheduler.enter(delay, priority, action, argument=(), kwargs={}) Schedule an event for delay more time units. Other than the relative time, the other arguments, the effect and the return value are the same as those for enterabs(). Changed in version 3.3: argument parameter is optional. New in version 3.3: kwargs parameter was added.

>>> import sched, time >>> s = sched.scheduler(time.time, time.sleep) >>> def print_time(a='default'): ... print("From print_time", time.time(), a) ... >>> def print_some_times(): ... print(time.time()) ... s.enter(10, 1, print_time) ... s.enter(5, 2, print_time, argument=('positional',)) ... s.enter(5, 1, print_time, kwargs={'a': 'keyword'}) ... s.run() ... print(time.time()) ... >>> print_some_times() 930343690.257 From print_time 930343695.274 positional From print_time 930343695.275 keyword From print_time 930343700.273 default 930343700.276 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.