Saturday, August 16, 2014

Send email on error with pgAgent

pgAgent is a scheduler for Postgres. It's not bad, but doesn't have the ability to email an admin when there's an error.

Here's a hackish work-around to get that going.

Install pgAgent


On Windows, this is most easily done with Stack Builder from EnterpriseDB. Make sure the pgAgent service is running after you install pgAgent

(Optional) Install and set up a localhost Mail Transfer Agent (MTA)

Configuring your SMTP server settings in multiple programs isn't DRY, so it's better to set up a localhost MTA and have it relay to your SMTP server. It's also dumb to have your program wait for a response from the SMTP server. A localhost MTA gives you a free queue where you can drop a message in localhost:25 and not wait.

For developing/testing on Windows, you can use smtp4dev, which will act as a localhost MTA, but open incoming messages in your default email client, such as Outlook.

(Windows only) Get a command line SMTP client


I recommend blat. It's a simple command line SMTP client.

You can send an email like this

blat -t to@example.com -f from@example.com -subject "The Subject" -body "The body" -server host:port

Add blat to the PATH.

Create Some Views in the postgres database


Get the last time that the email job was run successfully, or -infinity:


create view email_job_last_success as 
  SELECT 
    COALESCE(
    ( SELECT 
      a.jlgstart
    FROM 
      pgagent.pga_joblog a
    JOIN 
      pgagent.pga_job b 
    ON 
      a.jlgjobid = b.jobid
    WHERE 
      b.jobname = 'email_job' 
      AND a.jlgstatus = 's' --success
    ORDER BY 
      a.jlgstart DESC
    LIMIT 1 ), 
    '-infinity'::timestamptz) AS jlgstart;

Get the list of steps that failed since the email job was last run successfully:

create view failed_steps AS 
  SELECT  
    *
  FROM 
    pgagent.pga_jobsteplog a
  JOIN 
    pgagent.pga_jobstep b 
  ON 
    a.jsljstid = b.jstid
  JOIN 
    pgagent.pga_job c 
  ON 
    b.jstjobid = c.jobid
  WHERE 
    jslstatus = 'f' --failure
    AND jobname <> 'email_job'
    AND jslstart > ( SELECT jlgstart FROM email_job_last_success)
  ORDER BY 
    jslstart DESC;

Create the Email Job


  1. Create a job called email_job (if you use a different name, you need to change the views above)
  2. Create a schedule for the job. Have it run say every five minutes. 
  3. Add a step
    1. Name: send_email , or whatever
    2. Connection type: local
    3. Database: postgres 
    4. Kind: SQL
    5. On error: Fail
    6. Definition:

do $$
begin

  /* if no failed steps, then don't send email: */
  if (select count(*) = 0 from failed_steps) then return; end if;

  copy (
    select
      jstdbname as "Database",
      jobname as "Job",
      jstname as "Step",
      jslstart as "Step Started At",
      jslresult as "Result Code",
      jsloutput as "Output"
    from
      failed_steps
  )
  to program 
    'blat -t dba@example.com -f pgAgent@example.com -server localhost:25 -subject "pgAgent job failure"' 
  with (
    format csv, 
    header true, 
    delimiter E'\t'
  );

end $$;

Change the line after "to program" above to customize your email message. Use sendmail instead of blat on Linux. The email job will run every X minutes. It will see if there are any steps (other than its own) that failed since the email job was last run successfully.

If there are failed steps, then it will COPY them (via stdin) to the email program as CSV. The CSV will appear as the email's body. The email program will send one email with all of the failed steps.

1 comment:

Anonymous said...
This comment has been removed by the author.