It’s better to have a dashboard page with all statistics/parameters of objects which are used in building your warehouse. By this administrator can judge/analyze/enhance few data issues very quickly on broader way.
1) User complains about not able to see latest data.
2) User complains about missing data which was exists previous day or in back days, but now missing.
3) Enables what the percentage of change on particular dimension or fact or stage tables.
4) Quickly find out if dimension or fact rows got reduced.
5) list goes on :)(please feel free to add)
To achieve this, I will illustrate you with one simple way and one advanced way.
Here is simple way, by make use of database dictionary objects, create a direct data base request with all required fields, for all objects used in building warehouse.
Here is simple way, by make use of database dictionary objects, create a direct data base request with all required fields, for all objects used in building warehouse.
Example: Table name, no of rows, last modified time.
SQL for above request:
select t.TABLE_NAME,t.NUM_ROWS,o.LAST_DDL_TIME
from user_objects o,user_tables t
where t.TABLE_NAME = o.OBJECT_NAME
where t.TABLE_NAME = o.OBJECT_NAME
Report look like as follows
By this administrator can judge particular table/dimension has not loaded today because of that latest data missed, if dimension rows are 0 then corresponding ETL process went for task,these kind of decision will be done easily.
From O.B.I.E.E |
By this administrator can judge particular table/dimension has not loaded today because of that latest data missed, if dimension rows are 0 then corresponding ETL process went for task,these kind of decision will be done easily.
Now little complex approach:
Here we maintain history of volume and will display this history table by direct database call, by doing this we can quickly identify if volume got reduced or volume did increased proportionately what we expect per ETL load/day/week etc.
Here we maintain history of volume and will display this history table by direct database call, by doing this we can quickly identify if volume got reduced or volume did increased proportionately what we expect per ETL load/day/week etc.
From O.B.I.E.E |
To do this we need to create a table in database and store statistics, update them as and when require i.e update the history tables after every ETL load or create a procedure and schedule it every day.
My recommendation would be maintain statistics in history table, for current day stats, amount of previous day to current day change, amount of last 7 days change, and last 30 days. So that if some tables are growing exponentially, we can fine tune them in well advance.
Here are sample scripts which I have used in above example.
CREATE TABLE SH.DBSTATS as
CREATE TABLE SH.DBSTATS as
select t1.TABLE_NAME ,
t1.NUM_ROWS as YESTERDAY_NO_ROWS,
t1.NUM_ROWS as TODAY_NO_ROWS
from user_tables t1
Here update/insert statement.
Here update/insert statement.
MERGE INTO SH.DBSTATS D
USING user_tables today
ON (today.TABLE_NAME = D.TABLE_NAME )
WHEN MATCHED THEN
UPDATE
SET
D.YESTERDAY_NO_ROWS = d.TODAYS_NO_ROWS,
d.TODAYS_NO_ROWS=today.NUM_ROWS
WHEN NOT MATCHED THEN
INSERT
USING user_tables today
ON (today.TABLE_NAME = D.TABLE_NAME )
WHEN MATCHED THEN
UPDATE
SET
D.YESTERDAY_NO_ROWS = d.TODAYS_NO_ROWS,
d.TODAYS_NO_ROWS=today.NUM_ROWS
WHEN NOT MATCHED THEN
INSERT
(D.Table_name,d.YESTERDAY_NO_ROWS,d.TODAYS_NO_ROWS)
VALUES
VALUES
(today.Table_name,today.NUM_ROWS,today.NUM_ROWS);
commit;
Note: By pass presentation cache.
commit;
Note: By pass presentation cache.
Explained every thing with Oracle technology.
Hope it helped you
Hope it helped you
Happy weekend :)
Srinivas Malyala