Wednesday, June 18, 2014

How to import Geonames data into PostgreSQL on Windows

  1. Download http://download.geonames.org/export/dump/allCountries.zip (251MB) to %ProgramData%\geonames\, and unzip it.
  2. In PostgreSQL, create a table to import the data into:
create table geonames (
  geoname_id int primary key,
  name text,
  ascii_name text,
  alternate_names text,
  latitude numeric(8,6),
  longitude numeric(9,6),
  feature_class char(1),
  feature_code text,
  country_code char(2),
  cc2 TEXT,
  admin1_code text,
  admin2_code text,
  admin3_code text,
  admin4_code text,
  population bigint,
  elevation int,
  dem text,
  timezone text,
  modification_date date
);

Run the COPY command:

copy geonames
from 'c:/programdata/geonames/allCountries/allCountries.txt'
with (
  format text,
  null ''
)

On my laptop it took ~65 seconds. There were 9,071,443 rows.

Let's say we just want to view the ISO 3166-1 countries. Lets's create an index on the feature_code column so we can quickly filter on it (warning, this can take a while):

create index on geonames (feature_code);

And now a view to show only the ISO countries:

create view iso_countries as 
select
*
from geonames 
where 
(
feature_code IN ( 'PCL', 'PCLD', 'PCLF', 'PCLI', 'PCLIX', 'PCLS' )
or geoname_id in (2461445, 5880801, 607072, 6697173) --western sahara, american samoa, svalbar and jan mayen, antarctica
)
and geoname_id <> 831053 --kosovo
order by
country_code;

Sunday, June 8, 2014

Pulling Exchange Rates From A JSON Web API with PostgreSQL

  1. This probably isn't a good idea on a busy server, as it might block
  2. Requires PL/PythonU
  3. Requires an App ID from openexchangerates.org, which is free (US base currency only)

A Function To Make An HTTP Request

create or replace function http_get(url text) returns text strict volatile language plpythonu AS $$

  import urllib2;
  
  return urllib2.urlopen(url).read();

$$;

A Function to Return a Monetary Amount in USD


CREATE TABLE exchange_rates_usd
(
  counter_currency char(3),
  close_date date,
  midmarket_rate numeric(19,10) NOT NULL,
  PRIMARY KEY (counter_currency, close_date)
);

create function usd(amount numeric(19,4), _counter_currency char(3), _close_date date = current_date) returns numeric(19,4) language plpgsql strict volatile as $$
declare
 result numeric(19,4);
 response json;
begin

 select 
  midmarket_rate 
 from exchange_rates_usd 
 where 
  counter_currency = _counter_currency 
  and close_date = _close_date 
 into result;

 if found then return amount / result; end if;

 response = http_get( format('https://openexchangerates.org/api/historical/%s.json?app_id=%s', _close_date, current_setting('my.OPENEXCHANGERATES_APP_ID') ) );

 insert into exchange_rates_usd (counter_currency, close_date, midmarket_rate) 
 select 
  rates.key::char(3)
  , _close_date
  , rates.value::text::numeric(19,10) 
 from json_each(response->'rates') rates;

 select 
  midmarket_rate 
 from exchange_rates_usd 
 where 
  counter_currency = _counter_currency 
  and close_date = _close_date 
 into result;

 return amount / result;

end $$;

The Output

/* you can set this value in postgresql.conf as well: */
set my.OPENEXCHANGERATES_APP_ID = 'whatever your app id is';

select usd(100, 'CAD', '2014-08-14');

91.66

Sunday, April 20, 2014

Broadcasting PostgreSQL NOTIFY messages to WebSocket Clients

In this post, I look at how to broadcast NOTIFY messages from a Postgres database to WebSocket clients using Spring MVC 4.

Source code here

Basically, if you change data in your database, it can notify web browser clients, without polling.

Technologies used:


The system works like this:

Client subscribes to a WebSocket topic...

NOTIFY event on database server ->
  PGNotificationListener on web server ->
      Send Websocket notification on server ->
         Receive Websocket event on browser. 

With the code below, if you call NOTIFY dml_events, 'some message'; in Postgres, it will be broadcast to all WebSocket clients.

Follow this answer  regarding proper listener setup

Project Structure:



pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.databasepatterns</groupId>
    <artifactId>pg-notify-websocket</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>war</packaging>
    <url>http://blog.databasepatterns.com/2014/04/postgresql-nofify-websocket-spring-mvc.html</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <spring.version>4.0.5.RELEASE</spring.version>
    </properties>

    <dependencies>

        <dependency>
            <!-- PostgreSQL JDBC driver -->
            <groupId>com.impossibl.pgjdbc-ng</groupId>
            <artifactId>pgjdbc-ng</artifactId>
            <version>0.3</version>
            <classifier>complete</classifier>
        </dependency>

        <dependency>
            <!-- Your JSON library -->
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>2.3.2</version>
        </dependency>

        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-jdbc</artifactId>
            <version>${spring.version}</version>
        </dependency>

        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-messaging</artifactId>
            <version>${spring.version}</version>
        </dependency>

        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-websocket</artifactId>
            <version>${spring.version}</version>
        </dependency>

        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-webmvc</artifactId>
            <version>${spring.version}</version>
        </dependency>

    </dependencies>

    <build>

        <plugins>

            <plugin>
                <groupId>org.apache.tomcat.maven</groupId>
                <artifactId>tomcat7-maven-plugin</artifactId>
                <version>2.2</version>
                <configuration>
                    <path>/</path>
                </configuration>
                <dependencies>
                    <dependency>
                        <groupId>org.apache.tomcat.embed</groupId>
                        <artifactId>tomcat-embed-websocket</artifactId>
                        <version>7.0.52</version>
                    </dependency>
                </dependencies>
            </plugin>

        </plugins>

    </build>

</project>

PGNotifyToWebSocket.java:

import com.impossibl.postgres.api.jdbc.PGConnection;
import com.impossibl.postgres.api.jdbc.PGNotificationListener;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.messaging.simp.SimpMessagingTemplate;

import javax.sql.DataSource;
import java.sql.Statement;


/**
 * @since 7/22/2014
 */
public class PGNotifyToWebSocket {

    @Autowired
    private SimpMessagingTemplate messagingTemplate;

    private PGConnection pgConnection;

    public PGNotifyToWebSocket(DataSource dataSource) throws Throwable {

        pgConnection = (PGConnection) dataSource.getConnection();

        pgConnection.addNotificationListener(new PGNotificationListener() {
            @Override
            public void notification(int processId, String channelName, String payload) {
                messagingTemplate.convertAndSend("/channels/" + channelName, payload);
            }
        });
    }

    public void init() throws Throwable {

        Statement statement = pgConnection.createStatement();
        statement.execute("LISTEN dml_events");
        statement.close();
    }

    public void destroy() throws Throwable {

        Statement statement = pgConnection.createStatement();
        statement.execute("UNLISTEN dml_events");
        statement.close();
    }
}
mvc-dispatcher-servlet.xml:

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:mvc="http://www.springframework.org/schema/mvc"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:websocket="http://www.springframework.org/schema/websocket"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
        http://www.springframework.org/schema/mvc http://www.springframework.org/schema/mvc/spring-mvc.xsd
        http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
        http://www.springframework.org/schema/websocket http://www.springframework.org/schema/websocket/spring-websocket.xsd
        ">

    <context:property-placeholder/>

    <context:component-scan base-package="com.databasepatterns"/>

    <mvc:annotation-driven/>

    <mvc:view-controller path="/" view-name="/WEB-INF/index.jsp"/>

    <bean class="com.impossibl.postgres.jdbc.PGDataSource" id="dataSource">
        <property name="host" value="${PGHOST:localhost}"/>
        <property name="database" value="${PGDATABASE:postgres}"/>
        <property name="password" value="${PGPASSWORD}"/>
        <property name="port" value="${PGPORT:5432}"/>
        <property name="user" value="${PGUSER:postgres}"/>
    </bean>

    <bean class="PGNotifyToWebSocket" init-method="init" destroy-method="destroy">
        <constructor-arg ref="dataSource"/>
    </bean>

    <websocket:message-broker application-destination-prefix="/app">
        <websocket:stomp-endpoint path="/hello">
            <websocket:sockjs/>
        </websocket:stomp-endpoint>
        <websocket:simple-broker prefix="/channels"/>
    </websocket:message-broker>

</beans>

index.jsp:

<html>

<body>

<p>Run <kbd>NOTIFY dml_events 'some message';</kbd> in Postgres (in the <code>$PGDATABASE</code> or <code>postgres</code> database). Tested with PG 9.3, on Windows 7, Chrome 36.</p>

<div id="out"></div>

<script src="//cdnjs.cloudflare.com/ajax/libs/sockjs-client/0.3.4/sockjs.min.js"></script>
<script src="//cdnjs.cloudflare.com/ajax/libs/stomp.js/2.3.3/stomp.min.js"></script>

<script>

      var socket = new SockJS("/hello");

      var stompClient = Stomp.over(socket);

      stompClient.connect( {}, function(frame) {

        stompClient.subscribe("/channels/dml_events", function(response) {
            document.getElementById("out").innerText += response + "\r\n\r\n";
        });

      });

</script>
</body>

</html>
web.xml:

<web-app version="3.0"
         xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="
         http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd
         ">

    <servlet>
        <servlet-name>mvc-dispatcher</servlet-name>
        <servlet-class>org.springframework.web.servlet.DispatcherServlet</servlet-class>
    </servlet>

    <servlet-mapping>
        <servlet-name>mvc-dispatcher</servlet-name>
        <url-pattern>/</url-pattern>
    </servlet-mapping>

</web-app>

Monday, March 3, 2014

Some Basic Unit Test Patterns In Postgres

Let's say you want to write a unit test for PostgreSQL that expects an exception. Here's one way to do it:
create table my_table (
  id int primary key,
  name text not null
);


create or replace function test.throws_exception_when_name_is_null() returns void as $$
begin

  insert into my_table values (1, null);
  exception when not_null_violation then return; --swallow the exception, because we are hoping for it. 

end $$ language plpgsql;


select test.throws_exception_when_name_is_null();
Some other common exception names:
  • foreign_key_violation
  • unique_violation
  • check_violation
See the full list of exceptions

Let's say you want to run a timed test. Throw an exception if the test takes too long:
create or replace function a_long_running_function() returns void as $$
begin

    perform pg_sleep(2);

end $$ language plpgsql;


create or replace function test.test_timed() returns void as $$
begin

    perform a_long_running_function();

end $$ language plpgsql set statement_timeout to 1900;


select test.test_timed();

Friday, February 14, 2014

Trees and Paths using WITH RECURSIVE in PostgreSQL

Let's say we had some data like this - from the United Nations geoscheme for world regions:

create table subregions (
  id smallint primary key,
  name text not null,
  parent_id smallint null references subregions(id)
);

insert into subregions values
(1,'World',null),
(2,'Africa',1),
(5,'South America',419),
(9,'Oceania',1),
(11,'Western Africa',2),
(13,'Central America',419),
(14,'Eastern Africa',2),
(15,'Northern Africa',2),
(17,'Middle Africa',2),
(18,'Southern Africa',2),
(19,'Americas',1),
(21,'Northern America',19),
(29,'Caribbean',419),
(30,'Eastern Asia',142),
(34,'Southern Asia',142),
(35,'South-Eastern Asia',142),
(39,'Southern Europe',150),
(53,'Australia and New Zealand',9),
(54,'Melanesia',9),
(57,'Micronesia',9),
(61,'Polynesia',9),
(142,'Asia',1),
(143,'Central Asia',142),
(145,'Western Asia',142),
(150,'Europe',1),
(151,'Eastern Europe',150),
(154,'Northern Europe',150),
(155,'Western Europe',150),
(419,'Latin America and the Caribbean',19);
And you wanted to make a pretty tree like this:

World
   Africa
      Eastern Africa
      Middle Africa
      Northern Africa
      Southern Africa
      Western Africa
   Americas
      Latin America and the Caribbean
         Caribbean
         Central America
         South America
      Northern America
   Asia
      Central Asia
      Eastern Asia
      South-Eastern Asia
      Southern Asia
      Western Asia
   Europe
      Eastern Europe
      Northern Europe
      Southern Europe
      Western Europe
   Oceania
      Australia and New Zealand
      Melanesia
      Micronesia
      Polynesia

Here's how you'd do it:

with recursive my_expression as (
  
  --start with the "anchor", i.e. all of the nodes whose parent_id is null:
  select
    id, 
    name as path,
    name as tree,
    0 as level 
  from subregions
  where 
    parent_id is null

  union all

  --then the recursive part:
  select
    current.id as id,
    previous.path || ' > ' || current.name as path,
    repeat('   ', previous.level + 1) || current.name as tree,
    previous.level + 1 as level
  from subregions current 
  join my_expression as previous on current.parent_id = previous.id
)
select
  tree
from my_expression
order by 
  path

You can think of WITH RECURSIVE as a chain of UNION statements. A good explanation here: How does a Recursive CTE run, line by line?

You can also show paths like this:

select
  path 
from my_expression
order by
  path

World
World > Africa
World > Africa > Eastern Africa
World > Africa > Middle Africa
World > Africa > Northern Africa
World > Africa > Southern Africa
World > Africa > Western Africa
World > Americas
World > Americas > Latin America and the Caribbean
World > Americas > Latin America and the Caribbean > Caribbean
World > Americas > Latin America and the Caribbean > Central America
World > Americas > Latin America and the Caribbean > South America
World > Americas > Northern America
World > Asia
World > Asia > Central Asia
World > Asia > Eastern Asia
World > Asia > South-Eastern Asia
World > Asia > Southern Asia
World > Asia > Western Asia
World > Europe
World > Europe > Eastern Europe
World > Europe > Northern Europe
World > Europe > Southern Europe
World > Europe > Western Europe
World > Oceania
World > Oceania > Australia and New Zealand
World > Oceania > Melanesia
World > Oceania > Micronesia
World > Oceania > Polynesia
Fiddle with it.

Saturday, February 1, 2014

call NOTIFY on a DDL event

create or replace function on_ddl_event() returns event_trigger language plpgsql as $$
begin
  perform pg_notify('ddl_events', format('{"tg_tag":"%s","statement_timestamp":"%s"}', tg_tag, statement_timestamp()));
  return;
end $$;

create event trigger on_sql_drop on sql_drop
execute procedure on_ddl_event();

create event trigger on_ddl_command_end on ddl_command_end 
execute procedure on_ddl_event();

It might send a NOTIFY like this:

Asynchronous notification of 'ddl_events' received from backend pid 6940
   Data: {"tg_tag":"ALTER TABLE","statement_timestamp":"2014-02-01 01:31:23.651-08"}