
A Practical Use For Python Decorators ― Logging, Error Checks, and Timing


Python decorators, especially ones defined in other libraries, can seem somewhat magical. Take, for example, Flask's routing mechanism. If I put a statement like @app.route("/") above my logic, then poof, suddenly that code will be executed when I go to the root URL on the server. And sure, decorators make sense when you read the many tutorials out there that describe them. But for the most part, those tutorials only explain what's going on, usually by printing out some text, not why you might want to use a decorator yourself.
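(For reference, here's the kind of minimal Flask app I mean, not from this project, just for illustration:

from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # Flask's decorator registers this function as the handler for "/"
    return "Hello from the root URL!"

One line of decorator syntax, and the function is wired into the routing table.)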

I was of that opinion before, but recently I realized I have the perfect use for a decorator in a project of mine. To get the content for Product Mentions, I have Python scrapers that go through Reddit looking for links to Amazon products. Once I find one, I gather up the link and use the Amazon Product API to get information on the product. Once that's in the database, I use Rails to display the items to the user.

While doing the scraping, I also wanted a web interface so I could check for errors, see how long the jobs were taking, and confirm overall that I hadn't missed anything. So along with the actual Python script that grabs the HTML and parses it, I created a table in the database for logging the scraping runs, and I update it for each job. Simple, and it does the job I want.
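The model itself isn't shown in this post, but the session.add / session.commit calls below suggest SQLAlchemy, so a ScrapeLog model might look something like this (the table name and column types are my guesses):

from datetime import datetime
from sqlalchemy import Column, Integer, String, Boolean, DateTime, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class ScrapeLog(Base):
    __tablename__ = "scrape_logs"  # hypothetical table name

    id = Column(Integer, primary_key=True)
    job_type = Column(String(50))              # e.g. "comments" or "threads"
    start_time = Column(DateTime, default=datetime.now)
    end_time = Column(DateTime)
    error = Column(Boolean, default=False)
    error_message = Column(Text)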

The issue I came across here, and where decorators come into play, is code reuse. After some refactoring, I have a few different jobs, all of which follow the same format: create an object for the job and commit it to the database so I can see that it's running in real time, try some code specific to the job, except and log any error so we don't crash the process, and then post the end time of the job.

from datetime import datetime  # both jobs timestamp with datetime.now()

def gather_comments():
    scrape_log = ScrapeLog(start_time=datetime.now(), job_type="comments")
    session.add(scrape_log)
    session.commit()
    try:
        rg = RedditGatherer()
        rg.gather_comments()
    except Exception as e:
        scrape_log.error = True
        scrape_log.error_message = str(e)
    scrape_log.end_time = datetime.now()
    session.add(scrape_log)
    session.commit()

def gather_threads():
    scrape_log = ScrapeLog(start_time=datetime.now(), job_type="threads")
    session.add(scrape_log)
    session.commit()
    try:
        rg = RedditGatherer()
        rg.gather_threads()
    except Exception as e:
        scrape_log.error = True
        scrape_log.error_message = str(e)
    scrape_log.end_time = datetime.now()
    session.add(scrape_log)
    session.commit()

If you know a bit about how decorators work, you can already see why this is the perfect opportunity to use one: decorators let you extend and reuse functionality on top of functions you already have. I want to log, time, and error-check my scraping, and copying the same code around is not ideal. A decorator is. Here's how to write one.

Decorator Time

The first thing to do is write a function that takes a function as a parameter and calls it at the appropriate time. Since all the jobs above do their work in the same format, this turns out really nicely.

def gather_comments():
    rg = RedditGatherer()
    rg.gather_comments()

def log_and_time(function, job_type):
    scrape_log = ScrapeLog(start_time=datetime.now(), job_type=job_type)
    session.add(scrape_log)
    session.commit()
    try:
        function()  # run the code specific to this job
    except Exception as e:
        scrape_log.error = True
        scrape_log.error_message = str(e)
    scrape_log.end_time = datetime.now()
    session.add(scrape_log)
    session.commit()

log_and_time(gather_comments, "comments")

Very nice, and we could even stop here if we wanted to. But let's not, because to run the gathering functions as they are now, we'd have to remember to wrap them in that logging functionality every time. What we really want is to just define the gather_XXXX functions and know that whenever we call them, we get the logging built in.

Functions That Return a Function

Since the end goal is to have a function like gather_comments that I can use anywhere without worrying about the logging wrapper, let's try something different. Here, we define log_and_time to take a function as a parameter and run it in the same place in the logging code as before. But this time, log_and_time returns a callable function. So in the last line, we set gather_comments to be that returned function.

def gather_comments_wrapped():
    rg = RedditGatherer()
    rg.gather_comments()

def log_and_time(function):
    def log():
        scrape_log = ScrapeLog(start_time=datetime.now(), job_type="comments")
        session.add(scrape_log)
        session.commit()
        try:
            function()  # run the code specific to this job
        except Exception as e:
            scrape_log.error = True
            scrape_log.error_message = str(e)
        scrape_log.end_time = datetime.now()
        session.add(scrape_log)
        session.commit()
    return log

gather_comments = log_and_time(gather_comments_wrapped)

So now if we call gather_comments() like so, we get the functionality we want:
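gather_comments()  # creates the ScrapeLog entry, runs the scraper, and records errors and end time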

Syntax for Decorators

Now that we have the code set up, we can use the fancy decorator syntax to avoid the extra assignment line from the code block above. Nice and simple.

def log_and_time(function):
    def log():
        scrape_log = ScrapeLog(start_time=datetime.now(), job_type="comments")
        session.add(scrape_log)
        session.commit()
        try:
            function()  # run the code specific to this job
        except Exception as e:
            scrape_log.error = True
            scrape_log.error_message = str(e)
        scrape_log.end_time = datetime.now()
        session.add(scrape_log)
        session.commit()
    return log

@log_and_time
def gather_comments():
    rg = RedditGatherer()
    rg.gather_comments()

Passing in Arguments

But wait! you say. That job_type parameter is hard-coded to "comments" in the decorator function, and what if I have (like I do) a gather_threads function that searches for threads with Amazon links? Luckily, we can also pass arguments into the decorator, with a little modification and another wrapper function.

def log_and_time(job_type):
    def log_decorator(function):
        def log_work():
            print(job_type)
            scrape_log = ScrapeLog(start_time=datetime.now(), job_type=job_type)
            session.add(scrape_log)
            session.commit()
            log_info = None  # so there's still something to return if the job raises
            try:
                log_info = function()
            except Exception as e:
                scrape_log.error = True
                scrape_log.error_message = str(e)
            scrape_log.end_time = datetime.now()
            session.add(scrape_log)
            session.commit()
            return log_info  # return whatever the decorated function returns
        return log_work
    return log_decorator  # return the decorator function

@log_and_time("comments")
def gather_comments():
    rg = RedditGatherer()
    rg.gather_comments()

@log_and_time("threads")
def gather_threads():
    rg = RedditGatherer()
    rg.gather_threads()

Think for a second about what's going on here, and you can see why this makes sense. Python expects whatever comes after the @ to be a function that takes a function as a parameter, and in the cases above, it was. Here, because of the parentheses, we're calling the outermost function with an argument, so that call needs to return a function that takes a function as its argument, and the modified version does exactly that.
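In other words, the @log_and_time("comments") line is just shorthand for doing the two calls by hand:

decorator = log_and_time("comments")          # the outer call returns log_decorator
gather_comments = decorator(gather_comments)  # which wraps the job and returns log_work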

Running as Background Jobs

Like I wrote in my last post, I'm running all this scraping as background jobs. But as of now,
