Kettle / PDI Generic SQL Server JTDS database with parameters and a Generic SQL Server Database Connection
The purpose of using named parameters is to abstract out what parts of a database connection change, from one environment to another, and make that part of a configuration that can be maintained elsewhere. For example, for a given ETL to extract Appointment data, it might need to connect to different database servers, databases, and use different credentials in Development, QA, and Production.
This example makes the database connection more generic than necessary for that purpose, just as a learning exercise.
First… Define a database connection and use replacement variables for all of the parameters. In this example, the following parameters are used.
As an experiment, I decided to convert content in a Drupal 7 website to WordPress.
These steps assume an empty WordPress database, with ‘admin’ as the primary user.
The first step was to insert the Posts into WordPress…
Below are some Hibernate/JPA JDBC Settings for MySQL and SQL Server.
The purpose is mostly to remind myself, but may it will save someone else a few minutes.
This morning, I took a snapshot of a production, Ruby on Rails database (via rake db:dump), and then applied the production / schema data to other environments.
- Database is Oracle 11
- Application server runs Redhat Enterprise Linux
Assumptions and challenges:
- The Development and QA (Quality Assurance) instances are probably not at the same software revisions or migration version numbers. So… keep that in mind.
- A firewall prevents the Dev and QA applications servers from accessing the Production database, and application server directly. So, a different method must be employed.
- As with most enterprise type environments, developers do not have access to the production systems.
This methodology uses the gem yaml_db
I have a rake task “db:update_last_status”, which I need to have run on remote servers, nightly, to update cached status fields – which are used for fast search capabilities in an RoR application. The RoR application runs on Red Hat Enterprise Linux, an an Oracle database. It would be simple to script that with Cron, but that doesn’t lend itself to monitoring or work within the enterprise standards. Capistrano works great, but also isn’t within the Enterprise standards.
So… the rake task needs to be run through an Enterprise Automation System, called Appworx.
Appworx is used for running scheduled jobs on various servers, and multiple operating systems.
So far…. the Appworx config on this RHEL server entails
- Install Agent
- Configure agent
- Setup Appworx Job