Kettle PDI, Named Parameters, and Generic Database Connections

Kettle / PDI: a SQL Server (jTDS) connection defined with named parameters, using a generic database connection

The purpose of using named parameters is to abstract out the parts of a database connection that change from one environment to another, and to move them into configuration that can be maintained elsewhere. For example, a given ETL that extracts Appointment data might need to connect to different database servers and databases, and use different credentials, in Development, QA, and Production.

This example makes the database connection more generic than necessary for that purpose, just as a learning exercise.

First… define a database connection, using replacement variables for each of its settings. This example uses the following parameters:

  • DB_HOST
  • DB_DBNAME
  • DB_PORT
  • DB_USER
  • DB_PW

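As a rough sketch of where those variable values might come from: they can be set in kettle.properties (in the user's .kettle directory), one copy per environment, or declared as named parameters on the job and overridden at run time. The host, database, and credential values below are placeholders, not real settings.

    # $HOME/.kettle/kettle.properties -- placeholder values for one environment
    DB_HOST=dev-sqlserver.example.com
    DB_DBNAME=AppointmentsDev
    DB_PORT=1433
    DB_USER=etl_user
    DB_PW=not-a-real-password

In the connection dialog, each field is then set to the variable reference instead of a literal value, e.g. ${DB_HOST} for Host Name and ${DB_PORT} for Port Number. When the same names are defined as named parameters on the job, Kitchen can override them per environment, e.g. -param:DB_HOST=qa-sqlserver.example.com.
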
Continue reading

RoR: Using Yaml_db to copy Prod data/schema to Non-prod Environments

This morning, I took a snapshot of a production Ruby on Rails database (via rake db:dump), and then applied the production schema and data to other environments.

System information:

  • Database is Oracle 11
  • Application server runs Red Hat Enterprise Linux

Assumptions and challenges:

  • The Development and QA (Quality Assurance) instances are probably not at the same software revisions or migration version numbers. So… keep that in mind.
  • A firewall prevents the Dev and QA application servers from accessing the Production database and application server directly. So, a different method must be employed.
  • As with most enterprise environments, developers do not have access to the production systems.

This methodology uses the yaml_db gem.

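As a rough outline of the mechanics (the full steps are in the rest of the post), yaml_db adds rake tasks that serialize the schema and data to files under db/, which can then be carried across the firewall boundary by whatever transfer is permitted and loaded into the target environment. The host name, paths, and environment names below are placeholders.

    # On a host that is allowed to reach the Production database:
    RAILS_ENV=production rake db:dump     # writes db/schema.rb and db/data.yml

    # Move the files by an approved route (shared drive, scp via a jump host, etc.):
    scp db/schema.rb db/data.yml qa-app01:/u01/app/railsapp/db/

    # On the QA (or Dev) application server:
    RAILS_ENV=qa rake db:load             # loads the schema, then the data

Note that db:load rebuilds the target schema from the dump, which is why the revision mismatch mentioned above is worth keeping in mind.
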
Continue reading

Calling Ruby on Rails rake tasks through Appworx

I have a rake task, “db:update_last_status”, which needs to run nightly on remote servers to update cached status fields that are used for fast search capabilities in an RoR application. The RoR application runs on Red Hat Enterprise Linux with an Oracle database. It would be simple to script this with cron, but that doesn’t lend itself to monitoring and doesn’t fit within the enterprise standards. Capistrano works great, but also isn’t within the Enterprise standards.

So… the rake task needs to be run through an Enterprise Automation System called Appworx.
Appworx is used for running scheduled jobs across various servers and multiple operating systems.

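A common way to wire this up (a sketch under assumptions, not the exact configuration from this post) is to point the Appworx job at a small wrapper script on the agent host that sets up the Rails environment and runs the rake task. The script name, application path, and environment name below are hypothetical.

    #!/bin/bash
    # update_last_status.sh -- placeholder wrapper for the Appworx agent to run nightly
    set -e
    cd /u01/app/railsapp/current        # hypothetical application directory
    export RAILS_ENV=production
    rake db:update_last_status          # refreshes the cached status fields

A non-zero exit status from the script then surfaces as a failed job in Appworx, which is the monitoring a bare cron entry doesn’t give you.
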
So far… the Appworx config on this RHEL server entails:

  1. Install Agent
  2. Configure agent
  3. Set up Appworx Job: update_last_status_job

Continue reading