GoldenGate 12.2 – Installation of the monitoring agent

June 30, 2016, 12:50 am

≪ Previous: Question is: upgrade now to 12.1.0.2 or wait for 12.2 ?

As described in my last post, GoldenGate Studio requires the monitor agent on each GoldenGate instance. The goal of this agent is to allow GoldenGate Studio to interact with GoldenGate, for example, to deploy a new solution.

So lets start with the installation of the agent.

Download

The first step is to download the monitor agent (Oracle GoldenGate Monitor). It is available here.

Installation

The second step is to install the product.

After you have transferred the installer to the server decompress it.

oracle@srvoracle:/tmp/ [DB1] cd monitor/
oracle@srvoracle:/tmp/monitor/ [DB1] ll
total 420092
-rw-r--r-- 1 oracle oinstall 430166267 Feb 29 13:32 fmw_12.2.1.0.0_ogg_Disk1_1of1.zip
oracle@srvoracle:/tmp/monitor/ [DB1] unzip fmw_12.2.1.0.0_ogg_Disk1_1of1.zip 
Archive:  fmw_12.2.1.0.0_ogg_Disk1_1of1.zip
  inflating: fmw_12.2.1.0.0_ogg.jar  
oracle@srvoracle:/tmp/monitor/ [DB1] ll
total 840392
-rw-r--r-- 1 oracle oinstall 430166267 Feb 29 13:32 fmw_12.2.1.0.0_ogg_Disk1_1of1.zip
-r-xr-xr-x 1 oracle oinstall 430387063 Oct 14 08:33 fmw_12.2.1.0.0_ogg.jar

For launching the installer it is mandatory to have at a minimum Java version 8 (1.8). If this is not available it can be downloaded here.

To start the installation, launch the fmw_12.2.1.0.0_ogg.jar.

oracle@srvoracle:/home/oracle/Downloads/jdk1.8.0_73/bin/ [DB1] ./java -jar /tmp/monitor/fmw_12.2.1.0.0_ogg.jar 
Launcher log file is /tmp/OraInstall2016-02-29_01-39-26PM/launcher2016-02-29_01-39-26PM.log.
Extracting files.......
Starting Oracle Universal Installer

Checking if CPU speed is above 300 MHz.   Actual 2494.801 MHz    Passed
Checking monitor: must be configured to display at least 256 colors.   Actual 16777216    Passed
Checking swap space: must be greater than 512 MB.   Actual 4095 MB    Passed
Checking if this platform requires a 64-bit JVM.   Actual 64    Passed (64-bit not required)
Checking temp space: must be greater than 300 MB.   Actual 28817 MB    Passed


Preparing to launch the Oracle Universal Installer from /tmp/OraInstall2016-02-29_01-39-26PM
Log: /tmp/OraInstall2016-02-29_01-39-26PM/install2016-02-29_01-39-26PM.log
Logs successfully copied to /u01/app/oraInventory/logs.

The OUI (Oracle Universal Installer) will start. On the first screen just click on the next button.

On the next screen, we can choose the option for the updates. In my case, I leave the option to skip the auto updates.

Fill up the software location desired for GoldenGate agent.

Select the option to install only the agent monitor.

The OUI will test the system configuration and the java version.

The OUI provides a summary of the configuration. Click on next button, if all is ok.

The installation is done.

At the end OUI provides a summary of the installation with the location of the logs.

Now GoldenGate agent is installed.

Configuration

Create instance

To create the instance of the agent, go where the binaries have been installed. In this example, it is /u01/app/oracle/product/jagent/oggmon/ogg_agent.

After that, launch the script createMonitorAgentInstance.sh.

oracle@srvoracle:/u01/app/oracle/product/jagent/oggmon/ogg_agent/ [DB1] ./createMonitorAgentInstance.sh 
Please enter absolute path of Oracle GoldenGate home directory : /u01/app/oracle/product/12.1.0/gg_1
Please enter absolute path of OGG Agent instance : /u01/app/oracle/product/12.1.3.0/jagent
Please enter unique name to replace timestamp in startMonitorAgent script (startMonitorAgentInstance_20160229140552.sh) : 2
Sucessfully created OGG Agent instance.

Create password

The agent needs a password to work. All the passwords will be stored in a wallet. For this go to the ORACLE_HOME_AGENT/bin. In my case, /u01/app/oracle/product/12.1.3.0/jagent/bin.

Launch the script pw_agent_util.sh.

oracle@srvoracle:/u01/app/oracle/product/12.1.3.0/jagent/bin/ [DB1] ./pw_agent_util.sh -jagentonly
Please create a password for Java Agent: 
Please confirm password for Java Agent: 
Feb 29, 2016 2:18:55 PM oracle.security.jps.JpsStartup start
INFO: Jps initializing.
Feb 29, 2016 2:18:56 PM oracle.security.jps.JpsStartup start
INFO: Jps started.
Wallet is created successfully.

Enable monitoring

To enable the monitoring, launch ggsci command and edit the GOLBALS parameter file.

oracle@srvoracle:/u01/app/oracle/product/12.1.3.0/jagent/bin/ [DB1] ggi 

GGSCI (srvoracle) 2> edit params ./GLOBALS

GGSCI (srvoracle) 4> view params ./GLOBALS

GGSCHEMA ggadmin
CHECKPOINTTABLE ggadmin.checkpoint
ENABLEMONITORING

Now restart the ggsci command and the jagent appears when doing an “info all”.

oracle@srvoracle:/u01/app/oracle/product/12.1.3.0/jagent/bin/ [DB1] ggi

GGSCI (srvoracle) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED                                           
JAGENT      STOPPED

To finalize, delete the datastore and recreate one.

oracle@srvoracle:/u01/app/oracle/product/12.1.3.0/jagent/ [DB1] ggi

GGSCI (srvoracle) 1> delete datastore
Are you sure you want to delete the datastore? yes

2016-02-29 14:33:30  INFO    OGG-06492  Datastore deleted.

GGSCI (srvoracle) 2> create datastore
Profile 'Trace' added.

2016-02-29 14:33:55  INFO    OGG-06489  Datastore created.

Now you can start the manager process and the jagent.

Conclusion

Now that the agents have been installed on each instance, all the prerequisite for GoldenGate Studio are met.

In the next blog, I will show you how to use GoldenGate Studio to deploy a solution

Cet article GoldenGate 12.2 – Installation of the monitoring agent est apparu en premier sur Blog dbi services.

↧

Script to suggest FK indexes

July 4, 2016, 8:55 am

≫ Next: Compare source and target in a Dbvisit replication

≪ Previous: GoldenGate 12.2 – Installation of the monitoring agent

In Oracle, when the referenced key is deleted (by delete on parent table, or update on the referenced columns) the child tables(s) are locked to prevent any concurrent insert that may reference the old key. This lock is a big issue on OLTP applications because it’s a TM Share lock, usually reserved for DDL only, and blocking any modification on the child table and blocking some modifications on tables that have a relationship with that child table. This problem can be be overcome when an index structure which allows to find concurrent inserts that may reference the old value. Here is the script I use to find which index is missing.

The idea is not to suggest to index all foreign keys for three reasons:

when there are no delete or update in parent side, you don’t have that locking issue
when there is minimal write activity on child side, the lock may not have big consequence
you probably have indexes build for performance reasons that can be used to avoid locking even when they have more columns or have different column order

The idea is not to suggest an index for each locking issue but only when blocking locks have been observed. Yes, it is a reactive solution, but proactive ones cannot be automatic. If you know your application well and then you know what you ave to index, then you don’t need this script. If you don’t, then proactive suggestion will suggest too many indexes.

Here is the kind of output that I get with this script:
-- DELETE on APP1.GCO_GOOD has locked APP1.FAL_TASK in mode 5 for 8 minutes between 14-sep 10:36 and 14-sep 10:44 -- blocked statement: DELETE FAL LOT LOT WHERE C FAB TYPE AND EXISTS SELECT -- blocked statement: UPDATE DOC POSITION DETAIL SET DOC POSITION DETAIL ID B -- blocked statement: delete from C AP GCO GOOD where rowid doa rowid -- blocked statement: DELETE FROM FAL LOT WHERE FAL LOT ID B -- blocked statement: DELETE FROM FAL TASK LINK PROP WHERE FAL LOT PROP ID B -- blocked statement: INSERT INTO FAL LOT PROGRESS FAL LOT PROGRESS ID FAL LOT -- blocked statement: insert into FAL TASK LINK FAL SCHEDULE STEP ID -- FK chain: APP1.GCO_GOOD referenced by(cascade delete) APP1"."GCO_SERVICE referenced by(cascade set null) APP1"."FAL_TASK (APP1.FAL_TASK_S_GCO_SERV) unindexed -- FK column GCO_GCO_GOOD_ID -- Suggested index: CREATE INDEX ON "APP1"."FAL_TASK" ("GCO_GCO_GOOD_ID"); -- Other existing Indexes: CREATE INDEX "APP1"."FAL_TASK_S_DIC_FR_TASK_COD7_FK" ON "APP1"."FAL_TASK" ("DIC_FREE_TASK_CODE7_ID") -- Other existing Indexes: CREATE INDEX "APP1"."FAL_TASK_S_DIC_FR_TASK_COD9_FK" ON "APP1"."FAL_TASK" ("DIC_FREE_TASK_CODE9_ID") -- Other existing Indexes: CREATE INDEX "APP1"."FAL_TASK_S_PPS_TOOLS13_FK" ON "APP1"."FAL_TASK" ("PPS_TOOLS13_ID")
I’ll detail each part.

ASH

Yes we have to detect blocking issues from the past and I use ASH for that. If you don’t have Diagnostic Pack, then you have to change the query with another way to sample V$SESSION.
-- DELETE on APP1.GCO_GOOD has locked APP1.FAL_TASK in mode 5 for 8 minutes between 14-sep 10:36 and 14-sep 10:44 -- blocked statement: DELETE FAL LOT LOT WHERE C FAB TYPE AND EXISTS SELECT -- blocked statement: UPDATE DOC POSITION DETAIL SET DOC POSITION DETAIL ID B -- blocked statement: delete from C AP GCO GOOD where rowid doa rowid -- blocked statement: DELETE FROM FAL LOT WHERE FAL LOT ID B -- blocked statement: DELETE FROM FAL TASK LINK PROP WHERE FAL LOT PROP ID B -- blocked statement: INSERT INTO FAL LOT PROGRESS FAL LOT PROGRESS ID FAL LOT -- blocked statement: insert into FAL TASK LINK FAL SCHEDULE STEP ID

The first part of the output comes from ASH and detects the blocking situations: which statement, how long, and the statements that were blocked.
This part of the script will probably need to be customized: I join with DBA_HIST_SQL_PLAN supposing that the queries have been captured by AWR as long running queries. I check last 15 days of ASH. You may change those to fit the blocking situation encountered.

Foreign Key

Then, we have to find the unindexed foreign key which is responsible for those locks.
-- FK chain: APP1.GCO_GOOD referenced by(cascade delete) APP1"."GCO_SERVICE referenced by(cascade set null) APP1"."FAL_TASK (APP1.FAL_TASK_S_GCO_SERV) unindexed -- FK column GCO_GCO_GOOD_ID
Here you see that it’s not easy. Actually, all scripts I’ve seen do not detect that situation where the CASCADE SET NULL cascades the issue. Here “APP1″.”GCO_SERVICE” has its foreign key indexed but the SET NULL, even when not on the referenced column, locks the child (for no reason as far as I know, but it does).
My script goes up to a level 10 using a connect by query to detect this situation.

Suggested Index

The suggested index is an index on the foreign key column:
-- Suggested index: CREATE INDEX ON "APP1"."FAL_TASK" ("GCO_GCO_GOOD_ID");

This is only a suggestion. Any regular index that starts with foreign key column in whatever order can be used to avoid the lock.
Remember to think about performance first. The index may be used to navigate from parent to child.

Existing Index

Finally, when adding an index it’s good to check if there are other indexe that would not be needed anymore, so my script displays all of them.
If you think that some indexes are not required, remember that in 12c you can make them invisible for a while and you will be able to bring them back to visible quickly in case of regression.

Script

Here is the script. Sorry, no comments on it yet and a few display things to change. Just try it, it’s just a query on AWR (need Diag. Pack) and table/index/constraint metadata. You can customize it and don’t hesitate to comment if you have ideas to improve. I used it in several environments and it has always found the chain of foreign key that is responsible for an ‘enq: TM’ blocking situation. And believe me this is not always easy to do just by looking at the data model.

set serveroutput on declare procedure print_all(s varchar2) is begin null; dbms_output.put_line(s); end; procedure print_ddl(s varchar2) is begin null; dbms_output.put_line(s); end; begin dbms_metadata.set_transform_param(dbms_metadata.session_transform,'SEGMENT_ATTRIBUTES',false); for a in ( select count(*) samples, event,p1,p2,o.owner c_owner,o.object_name c_object_name,p.object_owner p_owner,p.object_name p_object_name,id,operation,min(p1-1414332420+4) lock_mode,min(sample_time) min_time,max(sample_time) max_time,ceil(10*count(distinct sample_id)/60) minutes from dba_hist_active_sess_history left outer join dba_hist_sql_plan p using(dbid,sql_id) left outer join dba_objects o on object_id=p2 left outer join dba_objects po on po.object_id=current_obj# where event like 'enq: TM%' and p1>=1414332420 and sample_time>sysdate-15 and p.id=1 and operation in('DELETE','UPDATE','MERGE') group by event,p1,p2,o.owner,o.object_name,p.object_owner,p.object_name,po.owner,po.object_name,id,operation order by count(*) desc ) loop print_ddl('-- '||a.operation||' on '||a.p_owner||'.'||a.p_object_name||' has locked '||a.c_owner||'.'||a.c_object_name||' in mode '||a.lock_mode||' for '||a.minutes||' minutes between '||to_char(a.min_time,'dd-mon hh24:mi')||' and '||to_char(a.max_time,'dd-mon hh24:mi')); for s in ( select distinct regexp_replace(cast(substr(sql_text,1,2000) as varchar2(60)),'[^a-zA-Z ]',' ') sql_text from dba_hist_active_sess_history join dba_hist_sqltext t using(dbid,sql_id) where event like 'enq: TM%' and p2=a.p2 and sample_time>sysdate-90 ) loop print_all('-- '||'blocked statement: '||s.sql_text); end loop; for c in ( with c as ( select p.owner p_owner,p.table_name p_table_name,c.owner c_owner,c.table_name c_table_name,c.delete_rule,c.constraint_name from dba_constraints p join dba_constraints c on (c.r_owner=p.owner and c.r_constraint_name=p.constraint_name) where p.constraint_type in ('P','U') and c.constraint_type='R' ) select c_owner owner,constraint_name,c_table_name,connect_by_root(p_owner||'.'||p_table_name)||sys_connect_by_path(decode(delete_rule,'CASCADE','(cascade delete)','SET NULL','(cascade set null)',' ')||' '||c_owner||'"."'||c_table_name,' referenced by') foreign_keys from c where level<=10 and c_owner=a.c_owner and c_table_name=a.c_object_name connect by nocycle p_owner=prior c_owner and p_table_name=prior c_table_name and ( level=1 or prior delete_rule in ('CASCADE','SET NULL') ) start with p_owner=a.p_owner and p_table_name=a.p_object_name ) loop print_all('-- '||'FK chain: '||c.foreign_keys||' ('||c.owner||'.'||c.constraint_name||')'||' unindexed'); for l in (select * from dba_cons_columns where owner=c.owner and constraint_name=c.constraint_name) loop print_all('-- FK column '||l.column_name); end loop; print_ddl('-- Suggested index: '||regexp_replace(translate(dbms_metadata.get_ddl('REF_CONSTRAINT',c.constraint_name,c.owner),chr(10)||chr(13),' '),'ALTER TABLE ("[^"]+"[.]"[^"]+") ADD CONSTRAINT ("[^"]+") FOREIGN KEY ([(].*[)]).* REFERENCES ".*','CREATE INDEX ON \1 \3;')); for x in ( select rtrim(translate(dbms_metadata.get_ddl('INDEX',index_name,index_owner),chr(10)||chr(13),' ')) ddl from dba_ind_columns where (index_owner,index_name) in (select owner,index_name from dba_indexes where owner=c.owner and table_name=c.c_table_name) and column_name in (select column_name from dba_cons_columns where owner=c.owner and constraint_name=c.constraint_name) ) loop print_ddl('-- Existing candidate indexes '||x.ddl); end loop; for x in ( select rtrim(translate(dbms_metadata.get_ddl('INDEX',index_name,index_owner),chr(10)||chr(13),' ')) ddl from dba_ind_columns where (index_owner,index_name) in (select owner,index_name from dba_indexes where owner=c.owner and table_name=c.c_table_name) ) loop print_all('-- Other existing Indexes: '||x.ddl); end loop; end loop; end loop; end; /

I didn’t take time to document/comment the script but don’t hesitate to ask what you don’t understand there.

You should not see any ‘enq: TM’ from an OLTP application. If you have them, even short, they will become problematic one day. It’s the kind of thing that can block the whole database.

Cet article Script to suggest FK indexes est apparu en premier sur Blog dbi services.

↧

Compare source and target in a Dbvisit replication

July 5, 2016, 11:51 am

≫ Next: Nulls in composite keys

≪ Previous: Script to suggest FK indexes

You’ve setup a logical replication, and you trust it. But before the target goes into production, it will be safer to compare source and target. At least count the number of rows.
But tables are continuously changing, so how can you compare? Not so difficult thanks to Dbvisit replicate heartbeat table and Oracle flashback query.

Here is the state of the replication, with activity on the source and real-time replication to the target:
| Dbvisit Replicate 2.7.06.4485(MAX edition) - Evaluation License expires in 29 days MINE IS running. Currently at plog 368 and SCN 6119128 (07/06/2016 04:15:21). APPLY IS running. Currently at plog 368 and SCN 6119114 (07/06/2016 04:15:19). Progress of replication dbvrep_XE:MINE->APPLY: total/this execution -------------------------------------------------------------------------------------------------------------------------------------------- REPOE.CUSTOMERS: 100% Mine:961/961 Unrecov:0/0 Applied:961/961 Conflicts:0/0 Last:06/07/2016 04:12:12/OK REPOE.ADDRESSES: 100% Mine:961/961 Unrecov:0/0 Applied:961/961 Conflicts:0/0 Last:06/07/2016 04:12:12/OK REPOE.CARD_DETAILS: 100% Mine:894/894 Unrecov:0/0 Applied:894/894 Conflicts:0/0 Last:06/07/2016 04:12:12/OK REPOE.ORDER_ITEMS: 100% Mine:5955/5955 Unrecov:0/0 Applied:5955/5955 Conflicts:0/0 Last:06/07/2016 04:12:12/OK REPOE.ORDERS: 99% Mine:4781/4781 Unrecov:0/0 Applied:4780/4780 Conflicts:0/0 Last:06/07/2016 04:12:12/OK REPOE.INVENTORIES: 100% Mine:5825/5825 Unrecov:0/0 Applied:5825/5825 Conflicts:0/0 Last:06/07/2016 04:12:12/OK REPOE.LOGON: 99% Mine:6175/6175 Unrecov:0/0 Applied:6173/6173 Conflicts:0/0 Last:06/07/2016 04:12:12/OK -------------------------------------------------------------------------------------------------------------------------------------------- 7 tables listed.

If you wand to compare the rows from source and target, you will always see a difference because modifications on source arrive on target a few seconds later.

Source and target SCN

The first thing to do is to determine a consistent point in time where source and target are the same. This point in time exists because the redo log is sequential by nature, and the commits are done in the same order on target than source. And this order is visible with the SCN. The only problem is that on a logical replication the SCN on source and target are completely different and have their own life.

The first step is determine an SCN from the target and an SCN on the source that show the same state of transactions.

But before that, let’s connect to the target and set the environment:

$ sqlplus /nolog @ compare.sql SQL*Plus: Release 11.2.0.2.0 Production on Tue Jul 5 18:15:34 2016 Copyright (c) 1982, 2014, Oracle. All rights reserved. Connected to: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit Production SQL> define table_owner=REPOE SQL> define table_name=ORDERS SQL> SQL> connect system/manager@//192.168.56.67/XE Connected. SQL> alter session set nls_date_format='DD-MON-YYYY HH24:mi:ss'; Session altered. SQL> alter session set nls_timestamp_format='DD-MON-YYYY HH24:mi:ss'; Session altered.

My example is on the #repattack environment, with Swingbench running on the source, and I’ll compare the ORDER table.

Heartbeat table

Each Dbvisit replicate configuration comes with an heartbeat table created in the Dbvisit schema on the source and replicated to the target. This table is updated every 10 seconds on the source with timestamp and SCN. This is a great way to check how the replication is working.Here it will be the way to get the SCN information from the source.

Flashback query

Oracle flashback query offers a nice way to get the commit SCN for the rows updated in the heartbeat table. From the target database, this is the commit SCN for the replication transaction (the APPLY process) and it can be displayed along with the SCN from the source transaction that is recorded in the heartbeat table and replicated to the target.

SQL> column versions_startscn new_value scn_target SQL> column source_scn new_value scn_source SQL> column mine_process_name format a12 SQL> column versions_starttime format a21 SQL> select mine_process_name,wallclock_date,mine_date,source_scn,mine_scn,versions_startscn,versions_starttime,versions_endscn from DBVREP.DBRSCOMMON_HEARTBEAT versions between timestamp(sysdate-1/24/60) and sysdate order by versions_endscn nulls last ; MINE_PROCESS WALLCLOCK_DATE MINE_DATE SOURCE_SCN MINE_SCN VERSIONS_STARTSCN VERSIONS_STARTTIME VERSIONS_ENDSCN ------------ -------------------- -------------------- -------------------- -------------------- -------------------- --------------------- -------------------- MINE 06-JUL-2016 04:14:27 06-JUL-2016 04:14:22 6118717 6118661 4791342 MINE 06-JUL-2016 04:14:37 06-JUL-2016 04:14:31 6118786 6118748 4791342 06-JUL-2016 04:11:29 4791376 MINE 06-JUL-2016 04:14:47 06-JUL-2016 04:14:41 6118855 6118821 4791376 06-JUL-2016 04:11:39 4791410 MINE 06-JUL-2016 04:14:57 06-JUL-2016 04:14:51 6118925 6118888 4791410 06-JUL-2016 04:11:49 4791443 MINE 06-JUL-2016 04:15:07 06-JUL-2016 04:15:01 6119011 6118977 4791443 06-JUL-2016 04:11:59 4791479 MINE 06-JUL-2016 04:15:17 06-JUL-2016 04:15:11 6119091 6119059 4791479 06-JUL-2016 04:12:09 4791515 MINE 06-JUL-2016 04:15:27 06-JUL-2016 04:15:21 6119162 6119128 4791515 06-JUL-2016 04:12:19

This shows that the current version of the heartbeat table on target was commited at SCN 4791515 and we know that this state matches the SCN 6119162 on the source. You can choose any pair you want but the latest will probably be the fastest to query.

Counting rows on source

I’ll use flashback query to count the rows from the source at SCN 6119162. I’m doing it in parallel query, but be careful when the table has high modification activity there will be lot of undo blocks to read.

SQL> connect system/manager@//192.168.56.66/XE Connected. SQL> alter session force parallel query parallel 8; Session altered. SQL> select count(*) from "&table_owner."."&table_name." as of scn &scn_source; old 1: select count(*) from "&table_owner."."&table_name." as of scn &scn_source new 1: select count(*) from "REPOE"."ORDERS" as of scn 6119162 COUNT(*) -------------------- 775433

Counting rows on target

I’m doing the same fron the target, but with the SCN 4791515
SQL> connect system/manager@//192.168.56.67/XE Connected. SQL> alter session force parallel query parallel 8; Session altered. SQL> select count(*) from "&table_owner."."&table_name." as of scn &scn_target; old 1: select count(*) from "&table_owner."."&table_name." as of scn &scn_target new 1: select count(*) from "REPOE"."ORDERS" as of scn 4791515 COUNT(*) -------------------- 775433

Good. Same number of rows. This proves that even with constantly inserted tables we can find a point of comparison, thanks to Dbvisit heartbeat table and thanks to Oracle flashback query. If you are replicating with another logical replication product, you can simulate the heartbeat table with a job that updates the current SCN to a single row table, and replicate it. If your target is not Oracle, then there are good chances that you cannot do that kind of ‘as of’ query which means that you need to lock the table on source for the time you compare.

ORA_HASH

If you think that counting the rows is not sufficient, you can compare a hash value from the columns. Here is an example.
I get the list of columns, with ORA_HASH() function on it, and sum() between them:

SQL> column columns new_value columns SQL> select listagg('ORA_HASH('||column_name||')','+') within group (order by column_name) columns 2 from dba_tab_columns where owner='&table_owner.' and table_name='&table_name'; old 2: from dba_tab_columns where owner='&table_owner.' and table_name='&table_name' new 2: from dba_tab_columns where owner='REPOE' and table_name='ORDERS' COLUMNS -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ORA_HASH(CARD_ID)+ORA_HASH(COST_OF_DELIVERY)+ORA_HASH(CUSTOMER_CLASS)+ORA_HASH(CUSTOMER_ID)+ORA_HASH(DELIVERY_ADDRESS_ID)+ORA_HASH(DELIVERY_TYPE)+ORA_HASH(INVOICE_ADDRESS_ID)+ORA_HASH(ORDER_DATE)+ORA_ HASH(ORDER_ID)+ORA_HASH(ORDER_MODE)+ORA_HASH(ORDER_STATUS)+ORA_HASH(ORDER_TOTAL)+ORA_HASH(PROMOTION_ID)+ORA_HASH(SALES_REP_ID)+ORA_HASH(WAIT_TILL_ALL_AVAILABLE)+ORA_HASH(WAREHOUSE_ID)

With this list defined in a substitution variable, I can compare the sum of hash values:

SQL> select count(*),avg(&columns.) hash from "&table_owner."."&table_name." as of scn &scn_target; old 1: select count(*),sum(&columns.) hash from "&table_owner."."&table_name." as of scn &scn_target new 1: select count(*),sum(ORA_HASH(CARD_ID)+ORA_HASH(COST_OF_DELIVERY)+ORA_HASH(CUSTOMER_CLASS)+ORA_HASH(CUSTOMER_ID)+ORA_HASH(DELIVERY_ADDRESS_ID)+ORA_HASH(DELIVERY_TYPE)+ORA_HASH(INVOICE_ADDRESS_ID)+ORA_HASH(ORDER_DATE)+ORA_HASH(ORDER_ID)+ORA_HASH(ORDER_MODE)+ORA_HASH(ORDER_STATUS)+ORA_HASH(ORDER_TOTAL)+ORA_HASH(PROMOTION_ID)+ORA_HASH(SALES_REP_ID)+ORA_HASH(WAIT_TILL_ALL_AVAILABLE)+ORA_HASH(WAREHOUSE_ID)) hash from "REPOE"."ORDERS" as of scn 4791515 COUNT(*) HASH -------------------- -------------------- 775433 317531150805040439 SQL> connect system/manager@//192.168.56.66/XE Connected. SQL> alter session force parallel query parallel 8; Session altered. SQL> select count(*),avg(&columns.) hash from "&table_owner."."&table_name." as of scn &scn_source; old 1: select count(*),sum(&columns.) hash from "&table_owner."."&table_name." as of scn &scn_source new 1: select count(*),sum(ORA_HASH(CARD_ID)+ORA_HASH(COST_OF_DELIVERY)+ORA_HASH(CUSTOMER_CLASS)+ORA_HASH(CUSTOMER_ID)+ORA_HASH(DELIVERY_ADDRESS_ID)+ORA_HASH(DELIVERY_TYPE)+ORA_HASH(INVOICE_ADDRESS_ID)+ORA_HASH(ORDER_DATE)+ORA_HASH(ORDER_ID)+ORA_HASH(ORDER_MODE)+ORA_HASH(ORDER_STATUS)+ORA_HASH(ORDER_TOTAL)+ORA_HASH(PROMOTION_ID)+ORA_HASH(SALES_REP_ID)+ORA_HASH(WAIT_TILL_ALL_AVAILABLE)+ORA_HASH(WAREHOUSE_ID)) hash from "REPOE"."ORDERS" as of scn 6119162 COUNT(*) HASH -------------------- -------------------- 775433 17531150805040439

Note that this is only an example. You must adapt for your needs: precision of the comparison and performance.

So what?

Comparing source and target is not a bad idea. With Dbvisit replicate, if you defined the replication properly and did the initial import with the SCN provided by the setup wizard, you should not miss transactions, even when there is lot of activity on source, and even without locking the source for the initialisation. But it’s always good to compare, especially before the ‘Go’ decision of a migration done with Dbvisit replicate to lower the downtime (and the stress). Thanks to heartbeat table and flashback query, a checksum is not too hard to implement.

Cet article Compare source and target in a Dbvisit replication est apparu en premier sur Blog dbi services.

↧

Nulls in composite keys

July 12, 2016, 4:36 am

≫ Next: Oracle Public Cloud patch conflict

≪ Previous: Compare source and target in a Dbvisit replication

Comparison of NULL can be misleading and it’s even worse for unique constraint validation. Having partial nulls in a composite key can be tricky because the SQL ANSI specification is not very easy to understand, and implementation can depend on the RDBMS. Here is an example with composite unique key and foreign key on Oracle.

Unique constraint

I create a table with a composite unique constraint:
SQL> create table TABLE1 (a char, b char, unique(a,b)); Table TABLE1 created.

I can insert a row with a=’X’ and b=’X':
SQL> insert into TABLE1 values ('X','X'); 1 row inserted.

I cannot insert the same row:
SQL> insert into TABLE1 values ('X','X'); * ERROR at line 1: ORA-00001: unique constraint (SYS.SYS_C0015464) violated

I insert another row with same value for column a but different value for column b:
SQL> insert into TABLE1 values ('X','Y'); 1 row inserted.

And another row with same value for column a but a null for column b:
SQL> insert into TABLE1 values ('X',null); 1 row inserted.

However, I cannot insert the same a second time:
SQL> insert into TABLE1 values ('X',null); * ERROR at line 1: ORA-00001: unique constraint (SYS.SYS_C0015464) violated

If you look at documentation, this is documented as:
Because of the search mechanism for unique key constraints on multiple columns, you cannot have identical values in the non-null columns of a partially null composite unique key constraint.

It looks like an implementation reason (the search mechanism is the index that enforces the unique constraint). What is documented in SQL-92?
A unique constraint is satisfied if and only if no two rows in a table have the same non-null values in the unique columns.

How to interpret this? We cannot insert two (‘X’,null) because that would be two rows with same non-null value (a=’X’) and the Oracle implementation is compilent.

Or is it? We can also read the definition as the unique constraint being violated only when we find rows that have non-null values and they are the same. This is what MySQL and PostgresSQL do: accept duplicates when there is at least one null.
This is also what I found more intuitive: I usually consider NULL as a value that is not known at insert time but that will be assigned a value later during the lifecycle of the row. Thus, I expect to be able to insert rows where there is a null and check the constraint only when all columns have a value.

It is probably an implementation choice from Oracle which stores nulls as a zero-length string and then cannot have two identical entries in a unique index.

Now inserting a row where a is null and b is null:
SQL> insert into TABLE1 values (null,null); 1 row inserted.

And because that do not violate the rule whatever the way we read it (non-null values are not the same as there are no non-null values at all here) I can insert a second one:
SQL> insert into TABLE1 values (null,null); 1 row inserted.

This is documented as
Unless a NOT NULL constraint is also defined, a null always satisfies a unique key constraint

About implementation, there is no problem because full null entries are not stored in the index. They are stored in bitmap indexes, but bitmap indexes cannot be used to enforce a unique constraint.

In summary, here is what can be stored on a table where (a,b) is unique but nullable:

SQL> select rownum,TABLE1.* from TABLE1; ROWNUM A B ---------- - - 1 X X 2 X Y 3 X 4 5

Foreign key

Now that I have a unique key, I can reference it:
SQL> create table TABLE2 (a char, b char, foreign key(a,b) references TABLE1(a,b)); Table TABLE2 created.

Yes. You don’t need to reference the primary key. Any unique key, even with nullable columns, can be referenced.

I can insert a row where parent exists:
SQL> insert into TABLE2 values('X','X'); 1 row inserted.

As I’ve no unique key on the child, it’s many to one relationship:
SQL> insert into TABLE2 values('X','X'); 1 row inserted.

I also have a parent with a=’X’ and b=’Y':
SQL> insert into TABLE2 values('X','Y'); 1 row inserted.

But I’ve no parent with a=’Y':
SQL> insert into TABLE2 values('Y','Y'); * ERROR at line 1: ORA-02291: integrity constraint (SYS.SYS_C0015465) violated - parent key not found

So far so good. I said that I have a many to one relationship, but it’s a many to one or zero because my columns are nullable:
SQL> insert into TABLE2 values(null,null); 1 row inserted.

So far so good. But I have a composite key with nullable columns here, and I can insert a row where a=’X’ and b is null:
SQL> insert into TABLE2 values('X',null); 1 row inserted.

But do you think that all non null parent values must exist?
SQL> insert into TABLE2 values('Y',null); 1 row inserted.

Once again, this is documented as:
If any column of a composite foreign key is null, then the non-null portions of the key do not have to match any corresponding portion of a parent key.

And this is what is specified in SQL-92:
If no <match type> was specified then, for each row R1 of the referencing table, either at least one of the values of the referencing columns in R1 shall be a null value, or the value of each referencing column in R1 shall be equal to the value of the corresponding referenced column in some row of the referenced table. More detail about the other match types in Oracle Development Guide.

That may look strange, but, still thinking about NULLS as unknown values, you can consider that constraints cannot be validated until we know all values.

Here is what I was able to insert into my table even with no a=’Y’ in the parent:

SQL> select rownum,TABLE2.* from TABLE2; ROWNUM A B ---------- - - 1 X X 2 X X 3 X Y 4 X 5 6 Y

So what?

Having nulls in composite unique key or foreign key can be misleading, then it’s better to ensure that what you define fits what you expect. It’s probably better to prevent partial nulls in foreign key (a check constraint can ensure that if one column is null then all columns must be null) or to have and additional referential integrity constraint which ensures that you can set only the allowed values for a subset of columns (in our case, a table with column a as primary key that we can reference).

Cet article Nulls in composite keys est apparu en premier sur Blog dbi services.

↧

Oracle Public Cloud patch conflict

July 13, 2016, 11:20 am

≫ Next: AMM and ASMM derived parameters

≪ Previous: Nulls in composite keys

This morning I wanted to test a patch (18633374) in the Oracle Cloud Service. The DBaaS was created as an ‘Enterprise Edition Extreme Performance’ which comes with all options, including multitenant option. I applied my patch. My test required to create a new tablespace but it failed with: ORA-65010: maximum number of pluggable databases created

This is the kind of message we get when we try to use a feature that is not allowed in Standard Edition. But I was in Enterprise Edition here:
First thing I did was to tweet a screenshot, in case someone encountered the issue already:

This is not exactly my definition of 'EE Extreme Perf'. WTF did I do? pic.twitter.com/y34JoX9mY3

— Franck Pachot (@FranckPachot) July 13, 2016

And second thing was to try to reproduce the issue because it’s a test environment where I did things quickly and I don’t remember all what was done.
I create a new service in EE Extreme Performance:
Connected to: Oracle Database 12c EE Extreme Perf Release 12.1.0.2.0 - 64bit Production With the Partitioning, Oracle Label Security, OLAP, Advanced Analytics and Real Application Testing options
check that I can create additional pluggable databases
SQL> show pdbs CON_ID CON_NAME OPEN MODE RESTRICTED ---------- ------------------------------ ---------- ---------- 2 PDB$SEED READ ONLY NO 3 PDB READ WRITE NO 4 DEMOS READ WRITE NO SQL> create pluggable database PDBNEW admin user admin identified by admin; Pluggable database created. SQL> create pluggable database PDBNEW1 admin user admin identified by admin; Pluggable database created.

I tried to do the same as I did (apply patch 18633374)

[oracle@CDBA 18633374]$ dbshut $ORACLE_HOME Processing Database instance "CDB": log file /u01/app/oracle/product/12.1.0/dbhome_1/shutdown.log $ORACLE_HOME/OPatch/opatch apply [oracle@CDBA 18633374]$ $ORACLE_HOME/OPatch/opatch apply Oracle Interim Patch Installer version 12.1.0.1.10 Copyright (c) 2016, Oracle Corporation. All rights reserved. Oracle Home : /u01/app/oracle/product/12.1.0/dbhome_1 Central Inventory : /u01/app/oraInventory from : /u01/app/oracle/product/12.1.0/dbhome_1/oraInst.loc OPatch version : 12.1.0.1.10 OUI version : 12.1.0.2.0 Log file location : /u01/app/oracle/product/12.1.0/dbhome_1/cfgtoollogs/opatch/18633374_Jul_13_2016_11_23_28/apply2016-07-13_11-23-28AM_1.log Verifying environment and performing prerequisite checks... Conflicts/Supersets for each patch are: Patch : 18633374 Conflict with 23192060 Conflict details: /u01/app/oracle/product/12.1.0/dbhome_1/lib/libserver12.a:kpdbc.o /u01/app/oracle/product/12.1.0/dbhome_1/lib/libserver12.a:krb.o /u01/app/oracle/product/12.1.0/dbhome_1/lib/libserver12.a:krbb.o /u01/app/oracle/product/12.1.0/dbhome_1/lib/libserver12.a:krbi.o /u01/app/oracle/product/12.1.0/dbhome_1/lib/libserver12.a:krbabr.o Following patches have conflicts: [ 18633374 23192060 ]

Yes, I remember that I had to de-install an interim patch that was there in my newly created DBaaS:

[oracle@CDBA 18633374]$ $ORACLE_HOME/OPatch/opatch lspatches 23192060; 22674709;Database PSU 12.1.0.2.160419, Oracle JavaVM Component (Apr2016) 22291127;Database Patch Set Update : 12.1.0.2.160419 (22291127)

Let’s do it:

[oracle@CDBA 18633374]$ $ORACLE_HOME/OPatch/opatch rollback -id 23192060 Oracle Interim Patch Installer version 12.1.0.1.10 Copyright (c) 2016, Oracle Corporation. All rights reserved. Oracle Home : /u01/app/oracle/product/12.1.0/dbhome_1 Central Inventory : /u01/app/oraInventory from : /u01/app/oracle/product/12.1.0/dbhome_1/oraInst.loc OPatch version : 12.1.0.1.10 OUI version : 12.1.0.2.0 Log file location : /u01/app/oracle/product/12.1.0/dbhome_1/cfgtoollogs/opatch/23192060_Jul_13_2016_11_24_49/rollback2016-07-13_11-24-49AM_1.log RollbackSession rolling back interim patch '23192060' from OH '/u01/app/oracle/product/12.1.0/dbhome_1' Please shutdown Oracle instances running out of this ORACLE_HOME on the local system. (Oracle Home = '/u01/app/oracle/product/12.1.0/dbhome_1') Is the local system ready for patching? [y|n] y User Responded with: Y Patching component oracle.oracore.rsf, 12.1.0.2.0... Patching component oracle.rdbms.rsf, 12.1.0.2.0... Patching component oracle.rdbms, 12.1.0.2.0... Deleting "kscs.o" from archive "/u01/app/oracle/product/12.1.0/dbhome_1/lib/libserver12.a" Deleting "kststqae.o" from archive "/u01/app/oracle/product/12.1.0/dbhome_1/lib/libserver12.a" Patching component oracle.rdbms.dbscripts, 12.1.0.2.0... RollbackSession removing interim patch '23192060' from inventory Log file location: /u01/app/oracle/product/12.1.0/dbhome_1/cfgtoollogs/opatch/23192060_Jul_13_2016_11_24_49/rollback2016-07-13_11-24-49AM_1.log OPatch succeeded.

Then I check if I’m still able to create a new PDB:

[oracle@CDBA 18633374]$ dbstart $ORACLE_HOME Processing Database instance "CDB": log file /u01/app/oracle/product/12.1.0/dbhome_1/startup.log

Ok, I’ll not detail the following problem for the moment:
ORA-01078: failure in processing system parameters LRM-00101: unknown parameter name 'encrypt_new_tablespaces'

During my tests I removed that encrypt_new_tablespaces parameter from the spfile to continue.

[oracle@CDBA 18633374]$ sqlplus / as sysdba SQL*Plus: Release 12.1.0.2.0 Production on Wed Jul 13 11:34:57 2016 Copyright (c) 1982, 2014, Oracle. All rights reserved. Connected to: Oracle Database 12c EE Extreme Perf Release 12.1.0.2.0 - 64bit Production SQL> create pluggable database PDBNEW2 admin user admin identified by admin; create pluggable database PDBNEW2 admin user admin identified by admin * ERROR at line 1: ORA-65010: maximum number of pluggable databases created

Ok, issue reproduced. Interim patch 23192060 is required to be able to have EE Extreme Perf able to act as an Enterprise Edition.

No description, but from alert.log the list of bugs bundled there:
Patch Id: 23192060 Patch Description: Patch Apply Time: Bugs Fixed: 19366375,19665921,19770063,21281607,21470120,21923026,23072137

A lookup in MOS gives:

19366375 – CORRECT ELAPSED TIME CALCULATION AND ADD DIAGNOSTIC FOR BUG 18920073
19665921 – ENABLE HCC FOR DBCS REGARDLESS OF EXTREME PERFORMANCE OPTION OR OTHER OPTIONS
19770063 – GET INFO ABOUT CLOUD BUNDLE INTO V$INSTANCE TABLE
21281607 – TRANSPARENTLY ENCRYPT TABLESPACE AT CREATION IN CLOUD
21470120 – CLOUD BACKPORT FOR HCC AND VERSION CHANGES
21923026 – ORA-600 [OLTP COMPRESSION SANITY CHECK] 23072137 – TDE OFFLINE ENCRYPTION SHOULD NOT BE ALLOWED CONCURRENTLY DURING MRP

Several Oracle Public Cloud specifics here. The ‘encrypt_new_tablespaces’ to do TDE for all new user tablespace, the HCC which is possible in any EE on the Cloud, and info about cloud edition in v$instance…

Let’s check the edition now that I’ve de-installed the patch 19770063:

SQL> select edition from v$instance; EDITION ------- UNKNOWN

This value comes from x$ksuxsinst.ksuxsedition which is 0 there and the view knows only the value 8 for Enterprise Edition:

SQL> select ksuxsedition from x$ksuxsinst; KSUXSEDITION ------------ 0 SQL> select view_definition from v$fixed_view_definition where view_name='GV$INSTANCE'; VIEW_DEFINITION --------------- select ks.inst_id,ksuxsins,ksuxssid,ksuxshst,ksuxsver,ksuxstim,decode(ksuxssts,0,'STARTED',1,'MOUNTED',2,'OPEN',3,'OPEN MIGRATE','UNKNOWN'),decode(ksuxsshr,0,'NO',1,'YES',2,NULL),ksuxsthr,decode(ksuxsarc,0,'STOPPED',1,'STARTED','FAILED'),decode(ksuxslsw,0,NULL,2,'ARCHIVE LOG',3,'CLEAR LOG',4,'CHECKPOINT', 5,'REDO GENERATION'),decode(ksuxsdba,0,'ALLOWED','RESTRICTED'),decode(ksuxsshp,0,'NO','YES'),decode(kvitval,0,'ACTIVE',2147483647,'SUSPENDED','INSTANCE RECOVERY'),decode(ksuxsrol,1,'PRIMARY_INSTANCE',2,'SECONDARY_INSTANCE','UNKNOWN'), decode(qui_state,0,'NORMAL',1,'QUIESCING',2,'QUIESCED','UNKNOWN'), decode(bitand(ksuxsdst, 1), 0, 'NO', 1, 'YES', 'NO'), ks.con_id, decode(ksuxsmode,2,'READ MOSTLY','REGULAR'), decode(ksuxsedition, 2, 'PO', 4, 'SE', 8, 'EE', 16, 'XE', 32, 'CS', 40, 'CE', 'UNKNOWN'), ksuxsfam from x$ksuxsinst ks, x$kvit kv, x$quiesce qu where kvittag = 'kcbwst'

No doubt, this meay lead to inaccesible EE features.

When you create a DBaaS in, wich includes the patch 2319206, you get the following for EE Extreme Performance:

SQL> host $ORACLE_HOME/OPatch/opatch lspatches 23192060; 22674709;Database PSU 12.1.0.2.160419, Oracle JavaVM Component (Apr2016) 22291127;Database Patch Set Update : 12.1.0.2.160419 (22291127) SQL> select edition from v$instance; EDITION ------- XP SQL> select ksuxsedition from x$ksuxsinst; KSUXSEDITION ------------ 256 SQL> select view_definition from v$fixed_view_definition where view_name='GV$INSTANCE'; VIEW_DEFINITION --------------- select ks.inst_id,ksuxsins,ksuxssid,ksuxshst,ksuxsver,ksuxstim,decode(ksuxssts,0,'STARTED',1,'MOUNTED',2,'OPEN',3,'OPEN MIGRATE','UNKNOWN'),decode(ksuxsshr,0,'NO',1,'YES',2,NULL),ksuxsthr,decode(ksuxsarc,0,'STOPPED',1,'STARTED','FAILED'),decode(ksuxslsw,0,NULL,2,'ARCHIVE LOG',3,'CLEAR LOG',4,'CHECKPOINT', 5,'REDO GENERATION'),decode(ksuxsdba,0,'ALLOWED','RESTRICTED'),decode(ksuxsshp,0,'NO','YES'),decode(kvitval,0,'ACTIVE',2147483647,'SUSPENDED','INSTANCE RECOVERY'),decode(ksuxsrol,1,'PRIMARY_INSTANCE',2,'SECONDARY_INSTANCE','UNKNOWN'), decode(qui_state,0,'NORMAL',1,'QUIESCING',2,'QUIESCED','UNKNOWN'), decode(bitand(ksuxsdst, 1), 0, 'NO', 1, 'YES', 'NO'), ks.con_id, decode(ksuxsmode,1,'REGULAR',2,'READ MOSTLY','READ ONLY'), decode(ksuxsedition, 2, 'PO', 4, 'SE', 8, 'EE', 16, 'XE', 32, 'CS', 64, 'CE', 128, 'HP', 256, 'XP', 'UNKNOWN'), ksuxsfam, kjidtv from x$ksuxsinst ks, x$kvit kv, x$quiesce qu, x$kjidt where kvittag = 'kcbwst'

So what?

The Oracle Public Cloud is a strange PaaS: database is provisioned automatically but you can break everything you want later: you are DBA, SYSDBA and even root, as in IaaS. But it’s not because you can do everything that you should do everything. The Oracle Database software has been adapted for the Cloud and requires specific patches. After each PSU, those patches are merged to be applied over the PSU. And if you need to apply a new patch which conflicts with one of them, then you should request a merge that includes the Cloud fixes.

Having different patches for Cloud and for on-premises is not very nice. If the goal is to have dev and test in the public cloud and prod on-premises, then we want the same software and the same patching procedures. But don’t worry, this is because the cloud arrived after 12.1.0.2 release. Next generation will be stabilized on the cloud first. We complains about ‘cloud-first’?

Cet article Oracle Public Cloud patch conflict est apparu en premier sur Blog dbi services.

↧

AMM and ASMM derived parameters

July 16, 2016, 1:24 pm

≫ Next: Redo log block size on ODA X6 all flash

≪ Previous: Oracle Public Cloud patch conflict

The latest DBA Essentials Workshop training I’ve given raised a question about PGA_AGGREGATE_LIMIT. The default depends on PGA_AGGREGATE_TARGET. So how is it calculated in AMM where PGA_AGGREGATE_TARGET is dynamic? Is it also dynamic or is it determined by the value at instance startup only?

The PGA_AGGREGATE_LIMIT default value is documented. I’ll use the following query to display the values of the concerned parameters:
select dbms_stats_internal.get_parameter_val('pga_aggregate_limit')/1024/1024/1024 "pga_aggregate_limit", 2 "2GB", 3*dbms_stats_internal.get_parameter_val('processes')/1024 "3MB*processes", 2*dbms_stats_internal.get_parameter_val('__pga_aggregate_target')/1024/1024/1024 "2*__pga_aggregate_target", dbms_stats_internal.get_parameter_val('__sga_target')/1024/1024/1024 "__sga_target", dbms_stats_internal.get_parameter_val('__pga_aggregate_target')/1024/1024/1024 "__pga_aggregate_target" from dual /
and I start with the following:
pga_aggregate_limit 2GB 3MB*processes 2*__pga_aggregate_target __sga_target __pga_aggregate_target ------------------- ---------- ------------- ------------------------ ------------ ---------------------- 2.40625 2 .87890625 2.40625 1.796875 1.203125

I’m in AMM with only MEMORY_TARGET set to 3G. The dynamic SGA is at 1.8G and the PGA at 1.2G. The PGA_AGGREGATE_LIMIT is at 200% of the PGA which is 2.4G

I increase the SGA in order to see a resize of the PGA

SQL> alter system set sga_target=2500M; System altered.

The PGA is now about 500M in order to release some space for SGA:

pga_aggregate_limit 2GB 3MB*processes 2*__pga_aggregate_target __sga_target __pga_aggregate_target ------------------- ---------- ------------- ------------------------ ------------ ---------------------- 2.40625 2 .87890625 1.09375 1.796875 .546875

However, the PGA_AGGREGATE_LIMIT did no change. the formula is not dynamic. The value that has been calculated at startup remains.

spfile

When dynamic components are resized, the values are written into the spfile with double underscore parameters, so that a restart of the instance starts with same value:
SQL> host strings /u01/app/oracle/product/12.1.0/dbhome_1/dbs/spfileCDB.ora | grep target CDB.__pga_aggregate_target=587202560 CDB.__sga_target=1929379840 *.memory_max_target=5G *.memory_target=3G .sga_target=2634022912

So let’s restart and see what happens to PGA_AGGREGATE_LIMIT (which has no double underscore entry in spfile)

SQL> startup force ORACLE instance started. Total System Global Area 5368709120 bytes Fixed Size 2935712 bytes Variable Size 3976201312 bytes Database Buffers 721420288 bytes Redo Buffers 13840384 bytes In-Memory Area 654311424 bytes Database mounted. Database opened. SQL> select 2 dbms_stats_internal.get_parameter_val('pga_aggregate_limit')/1024/1024/1024 "pga_aggregate_limit", 3 2 "2GB", 4 3*dbms_stats_internal.get_parameter_val('processes')/1024 "3MB*processes", 5 2*dbms_stats_internal.get_parameter_val('__pga_aggregate_target')/1024/1024/1024 "2*__pga_aggregate_target", 6 dbms_stats_internal.get_parameter_val('__sga_target')/1024/1024/1024 "__sga_target", 7 dbms_stats_internal.get_parameter_val('__pga_aggregate_target')/1024/1024/1024 "__pga_aggregate_target" 8 from dual 9 / pga_aggregate_limit 2GB 3MB*processes 2*__pga_aggregate_target __sga_target __pga_aggregate_target ------------------- ---------- ------------- ------------------------ ------------ ---------------------- 2 2 .87890625 1.09375 2.453125 .546875

The good thing is that the value is calculated from the actual values. Here 200% of PGA is smaller than 2G so 2G is used.

The bad thing is that a restart of the instance may bring a different behavior than before than restart.

So what?

This instability is easy to solve: don’t use AMM. SGA and PGA are different things and you should size them separately.
But the problem is wider. There are other parameters that can show same behavior. For example, the default db_file_multiblock_read_count can be limited by processes x __db_block_buffers.
You may have to change some values either manually or automatically at the start of a new application because you don’t know which is the best setting. But once the application is more stable, you should stabilize the dynamic sizing by setting minimum values.

Cet article AMM and ASMM derived parameters est apparu en premier sur Blog dbi services.

↧

Redo log block size on ODA X6 all flash

July 22, 2016, 2:43 pm

≫ Next: Large Pages and MEMORY_TARGET on Windows

≪ Previous: AMM and ASMM derived parameters

On the Oracle Database Appliance, the redo logs are on Flash storage (and with X6 everything is on Flash storage) so you may wonder if we can benefit from 4k redo blocksize. Here are some tests about it on an ODA X6-2M.

I’ll compare the same workload (heavy inserts) with 512 bytes and 4k bytes block size redo. However, we can’t create a log group different than 512 bytes:
ORA-01378: The logical block size (4096) of file /u03/app/oracle/redo/LABDB1/onlinelog/o1_mf_999_%u_.log is not compatible with the disk sector size (media sector size is 512 and host sector size is 512)

This is because the flash storage is exposed with 512 bytes sector size:

ASMCMD> lsdg State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name MOUNTED NORMAL N 512 4096 4194304 4894016 4500068 2441888 1023992 0 Y DATA/ MOUNTED NORMAL N 512 4096 4194304 1231176 221172 610468 -199762 0 N RECO/

Then, in order to be able to create new redo log groups with higher block size you need to set “_disk_sector_size_override” to TRUE;

I have 3 log groups with 512 bytes block size, and 3 groups with 4k:

LOGFILE GROUP 10 '+RECO/LABDB1/ONLINELOG/group_10.264.917867333' SIZE 51200M BLOCKSIZE 512, GROUP 11 '+RECO/LABDB1/ONLINELOG/group_11.265.917867489' SIZE 51200M BLOCKSIZE 512, GROUP 12 '+RECO/LABDB1/ONLINELOG/group_12.266.917867645' SIZE 51200M BLOCKSIZE 512, GROUP 13 '+RECO/LABDB1/ONLINELOG/group_13.267.917867795' SIZE 51200M BLOCKSIZE 4096, GROUP 14 '+RECO/LABDB1/ONLINELOG/group_14.268.917867913' SIZE 51200M BLOCKSIZE 4096, GROUP 15 '+RECO/LABDB1/ONLINELOG/group_15.269.917868013' SIZE 51200M BLOCKSIZE 4096

In 12c the database files should be on ACFS and not directly on the diskgroup. We did this on purpose in order to check if there is any overhead when in ACFS and we have seen exactly the same performance in both. There is something I dislike here however: redo log files are not multiplexed with multiple log members, but rely on the diskgroup redundancy. I agree with that in ASM because you are not supposed to manage the files and then risk to delete one of them. But in ACFS you see only one file, and if you drop it by mistake, both mirrors are lost, with the latest transactions.

On an insert intensive workload I take AWR snapshots between two log switches:

The switch between blocksize 512 and blocksize 4096 happened at 12:35

Don’t be nervous about those orange ‘log file sync waits’ we had to run 10000 transactions per second in order to get some contention here.

We have to go to the details in order to compare, from an AWR Diff report:

Workload Comparison ~~~~~~~~~~~~~~~~~~~ DB time: CPU time: Background CPU time: Redo size (bytes): Logical read (blocks): Block changes: Physical read (blocks): Physical write (blocks): Read IO requests: Write IO requests: Read IO (MB): Write IO (MB): IM scan rows: Session Logical Read IM: User calls: Parses (SQL): Hard parses (SQL): SQL Work Area (MB): Logons: Executes (SQL): Transactions: 1st Per Sec 2nd Per Sec %Diff
--------------- --------------- ------
37.9 37.3 -1.4
19.0 24.4 28.4
0.8 1.0 23.2
61,829,138.5 76,420,493.9 23.6
1,181,178.7 1,458,915.9 23.5
360,883.0 445,770.8 23.5
0.4 1.1 164.3
14,451.2 16,092.4 11.4
0.4 1.1 164.3
9,829.4 10,352.3 5.3
0.0 0.0 100.0
112.9 125.7 11.4
0.0 0.0 0.0
8,376.0 10,341.2 23.5
5,056.0 6,247.8 23.6
0.0 0.0 0.0
3.1 3.2 3.5
0.4 0.3 -37.2
225,554.2 278,329.3 23.4
10,911.0 13,486.4 23.6

The second workload, when redo blocksize was 4k, was able to handle 23% more activity.

‘log file sync’ average time is 1.3 milliseconds instead of 2.4:

Top Timed Events First DB/Inst: LABDB1/labdb1 Snaps: 155-156 (Elapsed time: 301.556 sec DB time: 11417.12 sec), Second DB/Inst: LABDB1/labdb1 Snaps: 157-158 (Elapsed time: 301.927 sec DB time: 11269.1 sec) -> Events with a "-" did not make the Top list in this set of snapshots, but are displayed for comparison purposes 1st 2nd ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Event Wait Class Waits Time(s) Avg Time(ms) %DB time Event Wait Class Waits Time(s) Avg Time(ms) %DB time ------------------------------ ------------- ------------ ------------ ------------- ----------- ------------------------------ ------------- ------------ ------------ ------------- ----------- CPU time N/A 5,722.8 N/A 50.1 CPU time N/A 7,358.4 N/A 65.3 log file sync Commit 2,288,655 5,412.1 2.4 47.4 log file sync Commit 2,808,036 3,535.5 1.3 31.4 target log write size Other 363,206 283.7 0.8 2.5 target log write size Other 644,287 278.2 0.4 2.5 log file parallel write System I/O 368,063 225.1 0.6 2.0 enq: TX - row lock contention Application 171,485 170.2 1.0 1.5 db file parallel write System I/O 12,399 160.2 12.9 1.4 db file parallel write System I/O 12,131 150.4 12.4 1.3 enq: TX - row lock contention Application 144,822 133.2 0.9 1.2 log file parallel write System I/O 649,501 148.1 0.2 1.3 library cache: mutex X Concurrency 130,800 120.8 0.9 1.1 library cache: mutex X Concurrency 86,632 128.1 1.5 1.1 log file sequential read System I/O 7,433 27.5 3.7 0.2 LGWR wait for redo copy Other 478,350 45.1 0.1 0.4 LGWR wait for redo copy Other 228,643 20.8 0.1 0.2 log file sequential read System I/O 6,577 21.7 3.3 0.2 buffer busy waits Concurrency 261,348 15.8 0.1 0.1 buffer busy waits Concurrency 295,880 20.1 0.1 0.2 --------------------------------------------------------------------------------------------------------------------

We see that this difference comes from lower latency in ‘log file parallel write':

Wait Events First DB/Inst: LABDB1/labdb1 Snaps: 155-156 (Elapsed time: 301.556 sec DB time: 11417.12 sec), Second DB/Inst: LABDB1/labdb1 Snaps: 157-158 (Elapsed time: 301.927 sec DB time: 11269.1 sec) -> Ordered by absolute value of 'Diff' column of '% of DB time' descending (idle events last) # Waits/sec (Elapsed Time) Total Wait Time (sec) Avg Wait Time (ms) ---------------------------------------- ---------------------------------------- ------------------------------------------- Event Wait Class 1st 2nd %Diff 1st 2nd %Diff 1st 2nd %Diff ------------------------------ ------------- -------------- -------------- ---------- -------------- -------------- ---------- --------------- --------------- ----------- log file sync Commit 7,589.5 9,300.4 22.5 5,412.1 3,535.5 -34.7 2.36 1.26 -46.61 log file parallel write System I/O 1,220.5 2,151.2 76.2 225.1 148.1 -34.2 0.61 0.23 -62.30 enq: TX - row lock contention Application 480.2 568.0 18.3 133.2 170.2 27.8 0.92 0.99 7.61 LGWR wait for redo copy Other 758.2 1,584.3 109.0 20.8 45.1 117.1 0.09 0.09 0.00 library cache: mutex X Concurrency 433.8 286.9 -33.8 120.8 128.1 6.0 0.92 1.48 60.87 db file parallel write System I/O 41.1 40.2 -2.3 160.2 150.4 -6.2 12.92 12.40 -4.02 cursor: pin S Concurrency 29.7 46.0 55.0 9.9 16.6 67.0 1.11 1.19 7.21 cursor: mutex X Concurrency 7.0 10.8 54.2 13.6 19.7 45.0 6.39 6.01 -5.95 latch: In memory undo latch Concurrency 585.3 749.0 28.0 10.8 16.3 50.8 0.06 0.07 16.67

In order to go into details, here is the wait event histogram for 512 bytes redo blocksize:

% of Waits ----------------------------------------------- Total Event Waits <1ms <2ms <4ms <8ms <16ms <32ms 1s ------------------------- ------ ----- ----- ----- ----- ----- ----- ----- ----- LGWR all worker groups 41 48.8 12.2 14.6 14.6 4.9 2.4 2.4 LGWR any worker group 259 6.2 5.4 8.9 13.9 18.1 18.1 29.3 LGWR wait for redo copy 228.9K 99.1 .9 .0 LGWR worker group orderin 442 44.6 9.7 4.5 5.0 9.3 10.6 16.3 log file parallel write 368.5K 85.3 7.5 4.7 1.4 .9 .2 .0 log file sequential read 7358 6.5 13.1 59.0 17.2 3.0 1.1 .2 log file sync 2.3M 48.9 23.1 17.0 5.7 2.7 2.3 .3

and for 4096 bytes blocksize:

% of Waits ----------------------------------------------- Total Event Waits <1ms <2ms <4ms <8ms <16ms <32ms 1s ------------------------- ------ ----- ----- ----- ----- ----- ----- ----- ----- LGWR all worker groups 20 45.0 5.0 15.0 10.0 5.0 20.0 LGWR any worker group 235 7.2 3.0 5.5 7.7 14.5 25.1 37.0 LGWR wait for redo copy 478.7K 98.9 1.0 .1 .0 LGWR worker group orderin 517 51.3 9.7 2.3 2.9 7.2 11.6 15.1 log file parallel write 649.9K 97.7 1.3 .3 .3 .4 .0 .0 log file sequential read 6464 5.7 8.2 73.5 11.0 1.2 .3 .1 log file sync 2.8M 78.2 15.6 2.3 .8 1.6 1.2 .

Few milliseconds are not perceived by end-user at commit except if the application has a design that is so bad that hundreds of commits are done for each user interaction. Even if both are really fast, the log writers was above 1ms for writes only for 1% of them when in blocksize 4k vs. 15% with default blocksize.

This faster latency is measured by I/O statistics as well:

Reads: Reqs Data Writes: Reqs Data Waits: Avg Function Name Data per sec per sec Data per sec per sec Count Tm(ms) --------------- ------- ------- ------- ------- ------- ------- ------- -------
BLOCKSIZE 512:
LGWR 0M 0.0 0M 18.1G 2420.4 61.528M 368.9K 0.6
BLOCKSIZE 4096:
LGWR 0M 0.0 0M 24.1G 4263.5 81.689M 649.5K 0.2

To be comprehensive, here are the statistics related with redo, thanks to those many statistics available in 12c:

Value per Second (Elapsed Time) ------------------------------------------- --------------------------------------- Statistic 1st 2nd %Diff 1st 2nd %Diff ------------------------------ ---------------- ---------------- --------- -------------- -------------- --------- redo KB read 16,319,609 15,783,576 -3.3 54,118.0 52,276.1 -3.4 redo blocks checksummed by FG 26,587,090 1,000,267 -96.2 88,166.3 3,312.9 -96.2 redo blocks written 37,974,499 6,318,372 -83.4 125,928.5 20,926.8 -83.4 redo blocks written (group 0) 37,256,502 6,257,861 -83.2 123,547.5 20,726.4 -83.2 redo blocks written (group 1) 717,997 60,511 -91.6 2,381.0 200.4 -91.6 redo entries 24,023,503 30,214,386 25.8 79,665.1 100,071.8 25.6 redo size 18,644,947,688 23,073,410,468 23.8 61,829,138.5 76,420,493.9 23.6 redo synch long waits 343 4,890 1,325.7 1.1 16.2 1,321.1 redo synch time 541,804 354,625 -34.5 1,796.7 1,174.5 -34.6 redo synch time (usec) 5,418,056,862 3,546,209,390 -34.5 17,967,000.7 11,745,254.3 -34.6 redo synch time overhead (usec) 145,664,759 197,925,281 35.9 483,043.8 655,540.2 35.7 redo synch time overhead count ( 2ms) 2,295,847 2,821,726 22.9 7,613.3 9,345.7 22.8 redo synch time overhead count ( 8ms) 443 3,704 736.1 1.5 12.3 734.7 redo synch time overhead count ( 32ms) 2 9 350.0 0.0 0.0 200.0 redo synch writes 2,305,502 2,849,645 23.6 7,645.4 9,438.2 23.5 redo wastage 179,073,264 2,703,864,280 1,409.9 593,830.9 8,955,357.7 1,408.1 redo write finish time 291,094,266 277,884,591 -4.5 965,307.5 920,370.1 -4.7 redo write gather time 63,237,013 125,066,420 97.8 209,702.4 414,227.3 97.5 redo write info find 2,296,292 2,825,439 23.0 7,614.8 9,358.0 22.9 redo write schedule time 63,679,682 125,819,850 97.6 211,170.3 416,722.8 97.3 redo write size count ( 4KB) 12,220 0 40.5 0 redo write size count ( 8KB) 26,420 2,246 -91.5 87.6 7.4 -91.5 redo write size count ( 16KB) 69,674 94,557 35.7 231.0 313.2 35.5 redo write size count ( 32KB) 108,676 268,794 147.3 360.4 890.3 147.0 redo write size count ( 128KB) 106,651 253,669 137.8 353.7 840.2 137.6 redo write size count ( 256KB) 37,332 28,076 -24.8 123.8 93.0 -24.9 redo write size count ( 512KB) 7,328 2,382 -67.5 24.3 7.9 -67.5 redo write size count (1024KB) 28 28 0.0 0.1 0.1 0.0 redo write time 29,126 27,817 -4.5 96.6 92.1 -4.6 redo write time (usec) 291,261,420 278,162,437 -4.5 965,861.8 921,290.4 -4.6 redo write total time 306,213,383 298,786,696 -2.4 1,015,444.5 989,599.1 -2.5 redo write worker delay (usec) 38,246,633 73,452,748 92.1 126,830.9 243,279.8 91.8 redo writes 368,330 649,751 76.4 1,221.4 2,152.0 76.2 redo writes (group 0) 366,492 648,430 76.9 1,215.3 2,147.6 76.7 redo writes (group 1) 1,838 1,321 -28.1 6.1 4.4 -28.2 redo writes adaptive all 368,330 649,752 76.4 1,221.4 2,152.0 76.2 redo writes adaptive worker 368,330 649,752 76.4 1,221.4 2,152.0 76.2

I’ve added a few things that were masked by the AWR Diff Report. The writes lower than 4k is zero in the second snapshots because it’s the blocksize.
It’s interesting to see that redo size is higher and this is because you write 4k even when you have less. This is measured by ‘redo wastage’.

So, larger blocksize lowers the latency but increases the volume. Here, where NVMe optimizes the bandwidth to Flash storage, it may not be a problem.

So what?

You have to keep in mind that this workload, with lot of small transactions and no other waits, is a special workload for this test. If you are not in this extreme case, then the default block size is probably sufficient for latency and reduces the redo size. However, if log file sync latency is your bottleneck, you may consider increasing the blocksize.

Thanks to

Oracle Authorized Solution Center, Switzerland.
Those tests were done on ODA X6-2M at Arrow OASC. Arrow has a wide range of Engineered Systems available for Oracle partners, like dbi services, and for customers to do Proof of Concepts, demos, learning, benchmarks, etc.

Cet article Redo log block size on ODA X6 all flash est apparu en premier sur Blog dbi services.

↧

Large Pages and MEMORY_TARGET on Windows

July 27, 2016, 2:57 pm

≫ Next: Exadata X-5 Bare Metal vs. OVM performance

≪ Previous: Redo log block size on ODA X6 all flash

In a previous post about enabling large page on Windows Server, I explained that it’s possible to use large pages with Automatic Memory Management (AMM, where sizing if SGA and PGA is automatic with MEMORY_TARGET setting) but possible does not mean that it is recommended. We feel that it’s not a good idea, but are there reasons for it or it’s just our linux backgroud that brings this opinion?

SGA_MAX_SIZE=4G, MEMORY_MAX_SIZE=6G

What is the size of allocated large pages? If it is set, SGA_MAX_SIZE is the size allocated as large pages when ORA_LPENABLE=1, even if you’ve set some MEMORY_MAX_SIZE:

I take screenshots with all information: the init.ora settings and the regedit entry on top-left. Latest sqlplus commands on top right (here screenshot is just after the startup). And memory given by Sysinternals RamMap.

So, when SGA_MAX_SIZE is set, it’s the maximum size of SGA that can be allocated, and this is what is allocated at startup from physical memory until the end of the instance. You can reduce the SGA_TARGET dynamically, but I don’t see any reason for that as the memory allocated in large page will not be released, nor swapped, nor usable for PGA.

SGA_MAX_SIZE unset, MEMORY_MAX_SIZE=6G

If we don’t set SGA_MAX_SIZE, then the SGA can grow up to MEMORY_MAX_SIZE and this is what is allocated at startup when ORA_LPENABLE=1:

Physical memory must be free

When ORA_LPENABLE=1 and not enough memory is available if physical memory, you get an error (ORA-27102 OS 1450) at startup:

What I did here was running 10 instances of SQLDeveloper to use 2GB on my 8GB VM.

Fragmentation

Not only we need the SGA to be allocated from physical memory, but it needs to be contiguous. Here is a screenshot I took some times later with those sqldev closed but after I had lot of activity on the VM:

As you see here, there is enough RAM (7GB) but not contiguous.

The recommandation when using large pages on Windows is to start all instances immediately after server restart, and if you have to restart an instance you may have to reboot the server. Note that the major advantage of large pages is on virtualized environments, and then you should not have more that one instance in a server. If you are convinced that with Windows it’s always good to restart the server, here you have a reason: fragmentation breaks large pages.

Mixed-mode

Let’s go back to the state where I had 10 SQLDeveloper opened. I change ORA_LPENABLE to 2 instead of 1 to be able to start the instance even is there is not enough contiguous RAM for the SGA (here for the MEMORY_TARGET as no SGA_MAX_SIZE is defined).

Now, I’m able to start the instance (but it took several minutes here as physical memory is exhausted):

Instance is started, but not all 6GB have been allocated as large pages. In this case where SGA_TARGET is 3GB, I presume that all SGA uses large pages unless we increase SGA_TARGET higher than the 5GB allocated, but this is only my guess.

MEMORY_MAX_SIZE and ORA_LPENABLE=0

So, now that we know how it works, let’s see the danger to run AMM with large pages.

Here is a database in AMM where MEMORY_TARGET=5GB after running some query that needs several GB of buffer cache (thanks to CACHE hint and “_serial_direct_read”=never) and several GB of PGA (thanks to manual workarea size policy). First, without large pages:

The RAM allocation is all private process memory (on Windows, Oracle processes are actually threads from one process only). And AMM achieves its goal: target is MEMORY_TARGET=5GB and this is what we have allocated for the instance.

MEMORY_MAX_SIZE and ORA_LPENABLE=2

Now doing the same with large page:

MEMORY_MAX_SIZE is supposed to be used for PGA+SGA and a large use of PGA should reduce SGA. But it cannot be done here because SGA is large page and PGA cannot be allocated from large pages. This means that AMM with large page do not achieve its goal. MEMORY_MAX_SIZE acts like having set SGA_MAX_SIZE to that value and very small PGA_AGGREGATE_TARGET. In this example (artificial example as I used manual workarea size policy, but same happens with auto and several sessions) physical memory is exhausted.

From that, I think we have a reason to recommend ASSM (Automatic Shared Memory Management) with large pages, as with Linux. In Oracle, because of the threaded architecture, it’s not a requirement but in my opinion it’s still a good idea to differentiate those memory areas that are so different:

SGA: one area allocated at startup, preferentially from large pages
PGA: variable size areas allocated and de-allocated by sessions

If you have more than few GB on your server, you should size SGA and PGA independently and benefit from large pages for SGA. Do not set MEMORY_MAX_SIZE then. Don’t set MEMORY_TARGET either as it acts as MEMORY_MAX_SIZE if this one is not set.

Cet article Large Pages and MEMORY_TARGET on Windows est apparu en premier sur Blog dbi services.

↧

Exadata X-5 Bare Metal vs. OVM performance

July 29, 2016, 12:54 am

≫ Next: DBA essentials workshop with release Oracle 12c is Available

≪ Previous: Large Pages and MEMORY_TARGET on Windows

The Exadata X5 can be installed Bare Metal or Virtualized. The latter one, Oracle VM allows to create multiple clusters on one machine, is more complex for installation and for storage capacity planning. But it allows a small flexibility on options licencing. Those are the real challenges behind the choice. However, when we talk about virtualization, most of questions are about the overhead. Last week, we did some tests on same machine with different configuration, thanks to Arrow Oracle Authorized Solution Center.

Comparison is not easy. Bare Metal has all resources. Virtualized has to distribute resources. And this test is very specific: all I/O hitting the ‘extreme’ flash cache because I don’t expect any virtualization overhead to be in milliseconds. So, don’t expect some universal conclusions from those tests. And don’t hesitate to comment about my way to read those numbers.

CPU

Do not expect a benchmark that shows the maximum capacity of the machine here. I’m comparing a bare metal node with 36 cores with a VM with 4 vCPUS. So I’ll compare a one thread workload only: SLOB with one session and SCALE=100M UPDATE_PCT=0 RUN_TIME=120 WORK_UNIT=64

Bare Metal load profile:
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 1.0 30.5 0.00 2.91 DB CPU(s): 1.0 29.0 0.00 2.76 Background CPU(s): 0.0 0.2 0.00 0.00 Redo size (bytes): 14,172.4 432,594.0 Logical read (blocks): 810,244.4 24,731,696.3 Block changes: 41.7 1,271.3 Physical read (blocks): 111.6 3,407.8 Physical write (blocks): 0.0 0.3 Read IO requests: 111.3 3,397.3 Write IO requests: 0.0 0.3 Read IO (MB): 0.9 26.6 Write IO (MB): 0.0 0.0 Executes (SQL): 12,285.1 374,988.5

Virtualized load profile:
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 1.0 30.6 0.00 4.37 DB CPU(s): 1.0 29.8 0.00 4.26 Background CPU(s): 0.0 0.2 0.00 0.00 Redo size (bytes): 13,316.5 410,404.0 Logical read (blocks): 848,095.1 26,137,653.8 Block changes: 41.1 1,266.3 Physical read (blocks): 109.1 3,361.3 Physical write (blocks): 0.0 0.3 Read IO requests: 103.8 3,198.5 Write IO requests: 0.0 0.3 Read IO (MB): 0.9 26.3 Write IO (MB): 0.0 0.0 Executes (SQL): 13,051.2 402,228.0

We can say that CPU and RAM performance is similar.

I/O

Now about IOPS on the storage cell flash cache.
I’ll compare SLOB with one session and SCALE=100000M UPDATE_PCT=100 RUN_TIME=120 WORK_UNIT=64

Bare Metal load profile:
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 1.0 0.0 0.02 4.06 DB CPU(s): 0.1 0.0 0.00 0.49 Background CPU(s): 0.1 0.0 0.00 0.00 Redo size (bytes): 1,652,624.9 51,700.6 Logical read (blocks): 2,582.2 80.8 Block changes: 4,214.5 131.9 Physical read (blocks): 2,060.6 64.5 Physical write (blocks): 1,818.0 56.9 Read IO requests: 2,051.0 64.2 Write IO requests: 1,738.6 54.4 Read IO (MB): 16.1 0.5 Write IO (MB): 14.2 0.4 Executes (SQL): 66.3 2.1 Rollbacks: 0.0 0.0 Transactions: 32.0

Virtualized load profile:
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 1.0 0.0 0.01 3.49 DB CPU(s): 0.3 0.0 0.00 1.01 Background CPU(s): 0.2 0.0 0.00 0.00 Redo size (bytes): 2,796,963.3 51,713.3 Logical read (blocks): 4,226.0 78.1 Block changes: 7,107.0 131.4 Physical read (blocks): 3,470.6 64.2 Physical write (blocks): 3,278.7 60.6 Read IO requests: 3,462.0 64.0 Write IO requests: 3,132.0 57.9 Read IO (MB): 27.1 0.5 Write IO (MB): 25.6 0.5 Executes (SQL): 86.9 1.6 Rollbacks: 0.0 0.0 Transactions: 54.1

In two minutes we did more work here. Timed events show statistics about the ‘cell single block reads’ which are nothing else than ‘db file sequential read’ renamed to look more ‘Exadata’. No SmartScan happens here as they go to buffer cache and we cannot do any filtering for blocks that will be shared with other sessions.

Bare Metal:
Total Wait Wait % DB Wait Event Waits Time (sec) Avg(ms) time Class ------------------------------ ----------- ---------- ---------- ------ -------- cell single block physical rea 249,854 115.7 0.46 94.9 User I/O DB CPU 14.6 12.0

Virtualized: Total Wait Wait % DB Wait
Event Waits Time (sec) Avg(ms) time Class ------------------------------ ----------- ---------- ---------- ------ -------- cell single block physical rea 425,071 109.3 0.26 89.4 User I/O DB CPU 35.2 28.8

Lower latency here on average which explains why we did more work. But no conclusion before we know where the latency comes from. Averages hides the details, and it’s the same with the ‘IO Profile’ section:

Bare Metal
IO Profile Read+Write/Second Read/Second Write/Second ~~~~~~~~~~ ----------------- --------------- --------------- Total Requests: 3,826.6 2,055.1 1,771.5 Database Requests: 3,789.5 2,051.0 1,738.6 Optimized Requests: 3,720.7 1,985.1 1,735.6 Redo Requests: 32.5 0.0 32.5 Total (MB): 32.0 16.2 15.9 Database (MB): 30.3 16.1 14.2 Optimized Total (MB): 29.3 15.6 13.7 Redo (MB): 1.7 0.0 1.7 Database (blocks): 3,878.6 2,060.6 1,818.0 Via Buffer Cache (blocks): 3,878.6 2,060.6 1,818.0 Direct (blocks): 0.0 0.0 0.0

Virtualized
IO Profile Read+Write/Second Read/Second Write/Second ~~~~~~~~~~ ----------------- --------------- --------------- Total Requests: 6,652.2 3,467.0 3,185.2 Database Requests: 6,594.0 3,462.0 3,132.0 Optimized Requests: 6,582.7 3,461.2 3,121.5 Redo Requests: 54.7 0.0 54.7 Total (MB): 55.6 27.2 28.4 Database (MB): 52.7 27.1 25.6 Optimized Total (MB): 51.8 27.1 24.6 Redo (MB): 2.8 0.0 2.8 Database (blocks): 6,749.3 3,470.6 3,278.7 Via Buffer Cache (blocks): 6,749.3 3,470.6 3,278.7 Direct (blocks): 0.0 0.0 0.0

and for IO statistics.
Bare Metal:
Reads: Reqs Data Writes: Reqs Data Waits: Avg Function Name Data per sec per sec Data per sec per sec Count Tm(ms) --------------- ------- ------- ------- ------- ------- ------- ------- ------- Buffer Cache Re 1.9G 2050.9 16.093M 0M 0.0 0M 250.2K 0.5 DBWR 0M 0.0 0M 1.7G 1740.5 14.216M 0 N/A LGWR 0M 0.0 0M 201M 32.5 1.648M 3914 0.3 Others 8M 4.1 .066M 1M 0.5 .008M 560 0.0 TOTAL: 1.9G 2055.0 16.159M 1.9G 1773.4 15.872M 254.6K 0.5

Virtualized:
Reads: Reqs Data Writes: Reqs Data Waits: Avg Function Name Data per sec per sec Data per sec per sec Count Tm(ms) --------------- ------- ------- ------- ------- ------- ------- ------- ------- Buffer Cache Re 3.3G 3462.7 27.12M 0M 0.0 0M 425.6K 0.3 DBWR 0M 0.0 0M 3.1G 3133.9 25.639M 0 N/A LGWR 0M 0.0 0M 341M 54.7 2.775M 6665 0.3 Others 10M 5.0 .081M 1M 0.5 .008M 514 0.3 TOTAL: 3.3G 3467.7 27.202M 3.4G 3189.0 28.422M 432.7K 0.3

I’ve put the physical read statistics side-by-side to compare:

BARE METAL VIRTUALIZED Statistic Total per Trans Total per Trans -------------------------------- ------------------ ------------- ------------------ ------------- cell flash cache read hits 242,142 62.1 425,365 64.0 cell logical write IO requests 5,032 1.3 8,351 1.3 cell overwrites in flash cache 200,897 51.5 937,973 141.1 cell physical IO interconnect by 8,145,832,448 2,089,210.7 14,331,230,720 2,156,044.9 cell writes to flash cache 638,514 163.8 1,149,990 173.0 physical read IO requests 250,168 64.2 425,473 64.0 physical read bytes 2,059,042,816 528,095.1 3,494,084,608 525,663.4 physical read partial requests 4 0.0 0 0.0 physical read requests optimized 242,136 62.1 425,365 64.0 physical read total IO requests 250,671 64.3 426,089 64.1 physical read total bytes 2,067,243,008 530,198.3 3,504,136,192 527,175.6 physical read total bytes optimi 1,993,089,024 511,179.5 3,497,918,464 526,240.2 physical read total multi block 0 0.0 0 0.0 physical reads 251,348 64.5 426,524 64.2 physical reads cache 251,348 64.5 426,524 64.2 physical reads cache prefetch 1,180 0.3 1,051 0.2 physical reads direct 0 0.0 0 0.0 physical reads direct (lob) 0 0.0 0 0.0 physical reads prefetch warmup 1,165 0.3 1,016 0.2 physical write IO requests 212,061 54.4 384,909 57.9 physical write bytes 1,816,551,424 465,901.9 3,300,933,632 496,605.0 physical write requests optimize 211,699 54.3 383,624 57.7 physical write total IO requests 216,077 55.4 391,445 58.9 physical write total bytes 2,026,819,072 519,830.5 3,656,793,600 550,142.0 physical write total bytes optim 1,755,620,352 450,274.5 3,171,875,328 477,189.0 physical write total multi block 531 0.1 942 0.1 physical writes 221,747 56.9 402,946 60.6 physical writes direct 0 0.0 0 0.0 physical writes direct (lob) 0 0.0 0 0.0 physical writes from cache 221,747 56.9 402,946 60.6 physical writes non checkpoint 221,694 56.9 402,922 60.6

We already know that there were more work on the OVM run but comparing the ‘per transaction’ statistics show similar things but a bit more ‘flash cache’ ‘optimized’ I/O in the second run.
Of course, even if it’s the same machine, it has been re-imaged, database re-created, different volume and capacity. So maybe I hit more the cell flash on the second run than on the first one and more reads on spinning disks can explain the difference on single block reads latency.

We need to get beyond the averages with the wait event histograms. They don’t show lower than millisecond in the AWR report (I’ve opened an enhancement request for 12.2 about that) but I collected them from the V$EVENT_HISTOGRAM_MICRO

Bare Metal:
EVENT WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT ---------------------------------------- --------------- ---------- ------------------------------ cell single block physical read 1 0 1 microsecond cell single block physical read 2 0 2 microseconds cell single block physical read 4 0 4 microseconds cell single block physical read 8 0 8 microseconds cell single block physical read 16 0 16 microseconds cell single block physical read 32 0 32 microseconds cell single block physical read 64 0 64 microseconds cell single block physical read 128 533 128 microseconds cell single block physical read 256 240142 256 microseconds cell single block physical read 512 7818 512 microseconds cell single block physical read 1024 949 1 millisecond cell single block physical read 2048 491 2 milliseconds cell single block physical read 4096 1885 4 milliseconds cell single block physical read 8192 3681 8 milliseconds cell single block physical read 16384 2562 16 milliseconds cell single block physical read 32768 257 32 milliseconds cell single block physical read 65536 52 65 milliseconds cell single block physical read 131072 3 131 milliseconds cell single block physical read 262144 0 262 milliseconds cell single block physical read 524288 1 524 milliseconds

Virtualized:
EVENT WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT ---------------------------------------- --------------- ---------- ------------------------------ cell single block physical read 1 0 1 microsecond cell single block physical read 2 0 2 microseconds cell single block physical read 4 0 4 microseconds cell single block physical read 8 0 8 microseconds cell single block physical read 16 0 16 microseconds cell single block physical read 32 0 32 microseconds cell single block physical read 64 0 64 microseconds cell single block physical read 128 1 128 microseconds cell single block physical read 256 322113 256 microseconds cell single block physical read 512 105055 512 microseconds cell single block physical read 1024 1822 1 millisecond cell single block physical read 2048 813 2 milliseconds cell single block physical read 4096 681 4 milliseconds cell single block physical read 8192 283 8 milliseconds cell single block physical read 16384 231 16 milliseconds cell single block physical read 32768 64 32 milliseconds cell single block physical read 65536 11 65 milliseconds cell single block physical read 131072 3 131 milliseconds

In the first run we see more reads around 8ms which confirms the previous guess that we had more flash cache hit on the second run.
The waits between 128 and 512 milliseconds are from the cell flash storage and this is where I want to see if virtualization has an overhead.
I’ve put it in color there where it’s easier to visualize that most of the reads are in the 128-256 range. Bare Metal in blue, OVM in orange.

In Bare Metal, most of the reads are faster than 256 microseconds. In virtualized there are some significant reads are above. This may be cause by virtualization but anyway that’s not a big difference. I don’t think that virtualization overhead is an important criteria when choosing how to install your Exadata. Storage capacity planning is the major criteria: consolidate all storage in two diskgroups (DATA and RECO) for all databases, or partition them for each cluster. choice is about manageability and agility in provisioning vs. licence optimization.

Cet article Exadata X-5 Bare Metal vs. OVM performance est apparu en premier sur Blog dbi services.

↧

DBA essentials workshop with release Oracle 12c is Available

July 29, 2016, 1:23 am

≫ Next: Oracle serializable is not serializable

≪ Previous: Exadata X-5 Bare Metal vs. OVM performance

It’s with pleasure to announced dbi services have upgrade the infrastructure for Oracle DBA essentials workshop.
We have migrate the release of Oracle 11g R2 to the latest version Oracle 12c features and install the last PSU (April 2016).

During this course you understand different topics:

Understand the Oracle database architecture
Install and Manage Oracle database
Administer Oracle databases with dbi expert certified OCM or ACE director

For more information about the workshop, please click on the link

Oracle Database release:
SQL> select product, version, status from product_component_version where product like '%Database%'; PRODUCT VERSION STATUS ------------------------------------------ -------------------- ------------------------------ Oracle Database 12c Enterprise Edition 12.1.0.2.0 64bit Production

Last PSU installed:
SQL> select patch_id, version, status, description from dba_registry_sqlpatch; PATCH_ID VERSION STATUS DESCRIPTION ---------- -------------------- --------------- ------------------------------------------------------ 22291127 12.1.0.2 SUCCESS Database Patch Set Update : 12.1.0.2.160419 (22291127)

Cet article DBA essentials workshop with release Oracle 12c is Available est apparu en premier sur Blog dbi services.

↧

Oracle serializable is not serializable

July 30, 2016, 3:17 pm

≫ Next: ODA X6 command line and Web Console

≪ Previous: DBA essentials workshop with release Oracle 12c is Available

Did you know that when you set isolation level to SERIALIZABLE, it is not serializable but SNAPSHOT? This isolation levels is lower than serializable. I’ve never thought about it until I read Markus Winand slides about transactions. I recommend every developer or DBA to read those slides. This post is there to illustrate write skew in Oracle.

Let’s show an example on SCOTT.EMP table. Let’s say there’s a HR directive to increase one of department 10 employees salary so that total salaries for the department is 9000.
Now let’s imagine that two HR users received the directive at the same time.

User A checks the salaries:
23:18:33 SID=365> select ename,sal from EMP where deptno=10; ENAME SAL ---------- ---------- CLARK 2450 KING 5000 MILLER 1300
The sum is 8750 so User A decides to increase MILLER’s salary with additional 250.

However, to be sure that he is the only one to do that, he starts a transaction in SERIALIZABLE isolation level, checks the sum again, and do the update:
23:18:40 SID=365> set transaction isolation level serializable; Transaction set. 23:18:41 SID=365> select sum(sal) from EMP where deptno=10; SUM(SAL) ---------- 8750 23:18:44 SID=365> update EMP set sal=sal+250 where ename='MILLER'; 1 row updated.

Now at the same time, User B is doing the same but chose to increase CLARK’s salary:

23:18:30 SID=12> set transaction isolation level serializable; Transaction set. 23:18:51 SID=12> select sum(sal) from EMP where deptno=10; SUM(SAL) ---------- 8750 23:18:53 SID=12> update EMP set sal=sal+250 where ename='CLARK'; 1 row updated.

Note that there is no “ORA-08177: can’t serialize access for this transaction” there because the updates occurs on different rows.

The User A checks again the sum and then commits his transaction:

23:18:46 SID=365> select sum(sal) from EMP where deptno=10; SUM(SAL) ---------- 9000 23:19:04 SID=365> commit; Commit complete.

And so does the User B:

23:18:55 SID=12> select sum(sal) from EMP where deptno=10; SUM(SAL) ---------- 9000 23:19:08 SID=12> commit; Commit complete.

However, once you commit, the result is different:

23:19:09 SID=12> select sum(sal) from EMP where deptno=10; SUM(SAL) ---------- 9250

Actually, what Oracle calls SERIALIZABLE here is only SNAPSHOT isolation level. You see data without the concurrent changes that have been commited after the beginning of your transaction. And you cannot modify a row that has been modified by another session. However, nothing prevents that what you have read is modified by another session. You don’t see those modification, but they can be commited.

The definition of serializability requires that the result is the same when transactions occurs one after the other. Here, if User A had commited before the start of User B transaction, the latter would have seen that the total were already at 9000.

In this example, if you want to prevent write skew you need to lock the table in Share mode. Locking the rows (with select for update) is sufficient to prevent concurrent updates, but then another user can insert a new employee which brings the total salary higher. In addition to that, row locks are exclusive and you don’t want readers to block readers. Locking a range (DEPTNO=10) is not possible in Oracle. So the solution is to lock the table.

It seems that only PostgreSQL (version >= 9.1) is able to guarantee true serializability without locking.

Cet article Oracle serializable is not serializable est apparu en premier sur Blog dbi services.

↧

ODA X6 command line and Web Console

August 27, 2016, 12:33 pm

≫ Next: ODA X6 database classes and shapes

≪ Previous: Oracle serializable is not serializable

The ODA X6 comes with a new command line (odacli) which replaces oakcli, and with a small web console which can display information about the appliance, the databases and the provisioning jobs. It also has the possibility to create a database, but this is for next blog post. In this post I’ll show which information are displayed once the ODA is installed.

The examples here come from ODA X6 version: 12.1.2.7.0

Appliance

The first screen is about the appliance information, the ones that you define when installed the ODA:

The same information can be displayed from command line with odacli describe-appliance:

[root@odax6m ~]# odacli describe-appliance Appliance Information ---------------------------------------------------------------- ID: bb8f0eec-0f5c-4319-bade-75770848b923 Platform: OdaliteM Data Disk Count: 2 CPU Core Count: 20 Created: Aug 26, 2016 2:51:26 PM System Information ---------------------------------------------------------------- Name: odax6m Domain Name: oracle.democenter.arrowcs.ch Time Zone: Europe/Zurich DB Edition: EE DNS Servers: 172.22.1.9 NTP Servers: 172.22.1.9 Disk Group Information ---------------------------------------------------------------- DG Name Redundancy Percentage ------------------------- ------------------------- ------------ Data Normal 75 Reco Normal 25

An important thing to note here is that the choice between Standard Edition and Enterprise Edition is at appliance level: you cannot mix.
There’s also no mention of virtualization because ODA X6 2S and 2M are only bare-metal.

odacli list-networks

[root@odax6m ~]# odacli list-networks ID Name NIC IP Address Subnet Mask Gateway ---------------------------------------- -------------------- ---------- ------------------ ------------------ ------------------ ffcf7d89-8074-4342-9f19-5e72ed695ce7 Private-network priv0 192.168.16.24 255.255.255.240 71a422bc-39d3-483c-b79b-ffe25129dfd2 Public-network btbond1 172.22.1.23 255.255.255.224 172.22.1.2

I’ve no Auto Service Request configured here:
[root@odax6m ~]# odacli describe-asr Aug 27, 2016 8:56:33 PM com.oracle.oda.dcscli.commands.AsrCommand$getAsr run SEVERE: No asr found

Databases

The second screen is about the databases:

From command line you have information about the ORACLE_HOMEs and databases.

[root@odax6m ~]# odacli list-dbhomes ID Name DB Version Home Location ---------------------------------------- -------------------- ---------- --------------------------------------------- 67419075-d1f9-4c2e-85b1-c74430e35120 OraDB12102_home1 12.1.0.2 /u01/app/oracle/product/12.1.0.2/dbhome_1 cf76a90b-f9e3-44b2-9b43-56111c1785e4 OraDB12102_home2 12.1.0.2 /u01/app/oracle/product/12.1.0.2/dbhome_2 adcbe8bf-f26f-4ab7-98a1-0abcd4412305 OraDB11204_home1 11.2.0.4 /u01/app/oracle/product/11.2.0.4/dbhome_1

[root@odax6m ~]# odacli list-databases ID DB Name DB Version CDB Class Shape Storage Status ---------------------------------------- ---------- ---------- ---------- -------- -------- ---------- ---------- 4c182ffb-3e4a-45c0-a6c6-15d5e9b7b2b9 dbee1 12.1.0.2 false OLTP odb4 ACFS Configured 5564ea51-fc93-46f2-9188-c13c23caba94 odb1s 12.1.0.2 true OLTP odb1s ACFS Configured 26c2213d-5992-4b2b-94b0-2d0f4d0f9c2d dbee11g1 11.2.0.4 false OLTP odb2 ACFS Configured

You can get more detail about one database:

[root@odax6m ~]# odacli describe-dbhome -i 67419075-d1f9-4c2e-85b1-c74430e35120 DB Home details ---------------------------------------------------------------- ID: 67419075-d1f9-4c2e-85b1-c74430e35120 Name: OraDB12102_home1 Version: 12.1.0.2 Home Location: /u01/app/oracle/product/12.1.0.2/dbhome_1 Created: Aug 26, 2016 2:51:26 PM

[root@odax6m ~]# odacli describe-database -i 4c182ffb-3e4a-45c0-a6c6-15d5e9b7b2b9 Database details ---------------------------------------------------------------- ID: 4c182ffb-3e4a-45c0-a6c6-15d5e9b7b2b9 Description: dbee1 DB Name: dbee1 DB Version: 12.1.0.2 DBID: 2933563624 CDB: false PDB Name: PDB Admin User Name: Class: OLTP Shape: odb4 Storage: ACFS CharacterSet: DbCharacterSet(characterSet=AL32UTF8, nlsCharacterset=AL16UTF16, dbTerritory=AMERICA, dbLanguage=AMERICAN) Home ID: 67419075-d1f9-4c2e-85b1-c74430e35120 Console Enabled: false Created: Aug 26, 2016 2:51:26 PM

Activity

Here is the log of what has been done on the ODA:

[root@odax6m ~]# odacli list-jobs ID Description Created Status ---------------------------------------- ------------------------------ ------------------------- ---------- 1b99d278-6ab4-4ead-a5f8-f112c74a8f97 Provisioning service creation Aug 26, 2016 2:51:26 PM Success f0ac9a2c-ba37-412c-8a81-9cc7cb301417 Database service creation with db name: odb1s Aug 26, 2016 4:03:39 PM Success dec37817-feb7-46e5-b991-b23362268cb1 Database service creation with db name: dbee11g1 Aug 26, 2016 5:09:33 PM Success

And we have more info about the steps executed for one job:

Same in command line:

[root@odax6m ~]# odacli describe-job -i 1b99d278-6ab4-4ead-a5f8-f112c74a8f97 Job details ---------------------------------------------------------------- ID: 1b99d278-6ab4-4ead-a5f8-f112c74a8f97 Description: Provisioning service creation Status: Success Created: 26.08.2016 14:51:26 Message: Task Name Start Time End Time Status ---------------------------------------- ------------------------- ------------------------- ---------- Setting up Network Aug 26, 2016 2:51:27 PM Aug 26, 2016 2:51:27 PM Success Creating group :asmdba Aug 26, 2016 2:51:38 PM Aug 26, 2016 2:51:38 PM Success Creating group :asmoper Aug 26, 2016 2:51:38 PM Aug 26, 2016 2:51:38 PM Success Creating group :asmadmin Aug 26, 2016 2:51:38 PM Aug 26, 2016 2:51:38 PM Success Creating group :dba Aug 26, 2016 2:51:38 PM Aug 26, 2016 2:51:38 PM Success Creating group :dbaoper Aug 26, 2016 2:51:38 PM Aug 26, 2016 2:51:38 PM Success Creating group :oinstall Aug 26, 2016 2:51:38 PM Aug 26, 2016 2:51:38 PM Success Creating user :grid Aug 26, 2016 2:51:38 PM Aug 26, 2016 2:51:38 PM Success Creating user :oracle Aug 26, 2016 2:51:38 PM Aug 26, 2016 2:51:38 PM Success Setting up ssh equivalance Aug 26, 2016 2:51:39 PM Aug 26, 2016 2:51:39 PM Success Gi Home creation Aug 26, 2016 2:54:49 PM Aug 26, 2016 2:57:54 PM Success Creating GI home directories Aug 26, 2016 2:54:49 PM Aug 26, 2016 2:54:49 PM Success Cloning Gi home Aug 26, 2016 2:54:49 PM Aug 26, 2016 2:57:54 PM Success GI stack creation Aug 26, 2016 2:57:54 PM Aug 26, 2016 3:08:44 PM Success Configuring GI Aug 26, 2016 2:57:54 PM Aug 26, 2016 2:58:21 PM Success Running GI root scripts Aug 26, 2016 2:58:21 PM Aug 26, 2016 3:05:03 PM Success Running GI config assistants Aug 26, 2016 3:05:03 PM Aug 26, 2016 3:05:34 PM Success Creating RECO disk group Aug 26, 2016 3:10:02 PM Aug 26, 2016 3:10:16 PM Success Creating volume reco Aug 26, 2016 3:10:16 PM Aug 26, 2016 3:10:26 PM Success Creating volume datdbee1 Aug 26, 2016 3:10:26 PM Aug 26, 2016 3:10:35 PM Success Creating ACFS filesystem for RECO Aug 26, 2016 3:10:35 PM Aug 26, 2016 3:10:42 PM Success Creating ACFS filesystem for DATA Aug 26, 2016 3:10:42 PM Aug 26, 2016 3:10:49 PM Success Db Home creation Aug 26, 2016 3:10:49 PM Aug 26, 2016 3:13:40 PM Success Creating DbHome Directory Aug 26, 2016 3:10:49 PM Aug 26, 2016 3:10:49 PM Success Extract DB clones Aug 26, 2016 3:10:49 PM Aug 26, 2016 3:12:29 PM Success Enable DB options Aug 26, 2016 3:12:29 PM Aug 26, 2016 3:12:38 PM Success Clone Db home Aug 26, 2016 3:12:38 PM Aug 26, 2016 3:13:37 PM Success Run Root DB scripts Aug 26, 2016 3:13:37 PM Aug 26, 2016 3:13:37 PM Success Database Service creation Aug 26, 2016 3:13:40 PM Aug 26, 2016 3:19:43 PM Success Database Creation Aug 26, 2016 3:13:40 PM Aug 26, 2016 3:17:58 PM Success Running DataPatch Aug 26, 2016 3:18:33 PM Aug 26, 2016 3:19:43 PM Success create Users tablespace Aug 26, 2016 3:19:43 PM Aug 26, 2016 3:19:46 PM Success

Yes, this is the ODA installation. Half an hour to setup the OS, install Grid Infrastructure, setup the storage and create a first database.

Refresh

Those that all the screens are not refreshed automatically even when you navigate through them. Don’t forget the ‘Refresh’ button (circular arrow) on top-right.

JSON

You can also build your own interface from the JSON format:

[root@odax6m ~]# odacli list-jobs -j [ { "jobId" : "1b99d278-6ab4-4ead-a5f8-f112c74a8f97", "status" : "Success", "message" : null, "createTimestamp" : 1472215886601, "description" : "Provisioning service creation" }, { "jobId" : "f0ac9a2c-ba37-412c-8a81-9cc7cb301417", "status" : "Success", "message" : null, "createTimestamp" : 1472220219016, "description" : "Database service creation with db name: odb1s" }, { "jobId" : "dec37817-feb7-46e5-b991-b23362268cb1", "status" : "Success", "message" : null, "createTimestamp" : 1472224173747, "description" : "Database service creation with db name: dbee11g1" } ]

So what?

ODA is for easy and fast provisioning and the GUI that was missing is finally there. Of course, it looks very simple, but that’s the goal of the appliance: setup quickly a standardized environment. ODA X6-2S is cheap and has good performance for small databases. You may find equivalent hardware, but can you build and install a stable hardware, OS and database in 30 minutes?

Cet article ODA X6 command line and Web Console est apparu en premier sur Blog dbi services.

↧

ODA X6 database classes and shapes

August 29, 2016, 6:47 am

≫ Next: Filenames in AWR reports

≪ Previous: ODA X6 command line and Web Console

On the Oracle Database Appliance, like on the Oracle public Cloud, you define the CPU capacity with ‘shapes’. On the latest ODA, the X6, we have a new interface to provision a database. Let’s look at the different shapes available.

ODACLI

You can provision a new database with the command line ODACLI which replaces the OAKCLI you used in ODA X5:
[root@odax6m ~]# odacli create-database Usage: create-database [options] Options: * --adminpassword, -mm Password for SYS,SYSTEM and PDB Admin --cdb, -c Create Container Database Default: false --dbclass, -cl Database Class OLTP/DSS/IMDB Default: OLTP --dbconsole, -co Enable Database Console Default: false --dbhomeid, -dh Database Home ID (Use Existing DB Home) * --dbname, -n Database Name --dbshape, -s Database Shape{odb1,odb2,odb3 etc.} Default: odb1 --dbstorage, -r Database Storage {ACFS|ASM} Default: ACFS --dbtype, -y Database Type {SI|RAC} Default: SI --help, -h get help Default: false --instanceonly, -io Create Instance Only (For Standby) Default: false --json, -j json output Default: false --pdbadmin, -d Pluggable Database Admin User Default: pdbadmin --pdbname, -p Pluggable Database Name Default: pdb1 --targetnode, -g Node Number (for single-instance databases) Default: 0 --version, -v Database Version Default: 12.1.0.2

ODA WebConsole

But the ODA X6 has also a small graphical interface from the web console.

12c multitenant is the default, but you can choose.

Edition

You don’t have the choice when you create the database. You install the ODA in Standard or Enterprise and then you cannot change.

Versions

Two database versions are available: 11.2.0.4 and 12.1.0.2

You choose ODA to get a stable, certified and supported system so it make sense to run only supported versions with latest PSU. If you have older versions, you must upgrade. Set optimizer_features_enable to previous if your application was not tuned for newer versions. Very often, when an ISV do not certify his software it’s because of optimizer regressions. With proper testing and optimizer settings you should be able to upgrade any application without the risk of regression.

Templates

There are four DBCA templates available

Standard Edition or Enterprise Edition
Multitenant or non-CDB

The main difference between Enterprise Edition and Standard Editions are:
Options OMS,SPATIAL,CWMLITE,DV are installed in Enterprise Edition but not in Standard Edition
fast_start_mttr_target=300 in Enterprise Edition (feature not supported in Standard Edition)

The main difference between multitenant and non-CDB:
Options JSERVER,IMEDIA,ORACLE_TEXT,APEX are installed in a CDB an not in a non-CDB
maxdatafiles=1024 in CDB (100 in non-CDB)

All templates are configured with filesystem_io_options=setall and use_large_pages=only

Following underscore parameters are set for all ODA templates:
*._datafile_write_errors_crash_instance=FALSE *._disable_interface_checking=TRUE *._enable_NUMA_support=FALSE *._file_size_increase_increment=2143289344 *._gc_policy_time=0 *._gc_undo_affinity=FALSE

Note that both 12c and 11g are available in Enterprise Edition as well as Standard Edition (can even be Standard Edition One for 11g).
Of course, CDB is only for 12c.

Shapes

As in the Oracle Public Cloud, the CPU and Memory comes in shapes:

The choice is the number of core. The cores are hyperthreaded, which means that odb1 will have cpu_count=2. And it is set in spfile. Note that at install no resource manager plan is active so instance caging will not occur except during the automatic maintenance window… My recommandation is to set a plan. In 12.1.0.2 Standard Edition resource manager is implicitly activated.

ODA X6-2 processors are Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz. Here is an example of the LIOPS you can reach when running on all the 40 threads of a X6-2M:

Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 39.9 545.6 0.00 25.43 DB CPU(s): 38.8 530.9 0.00 24.75 Logical read (blocks): 18,494,690.4 252,862,769.1
This is 18 million logical reads per seconds in this 2 sockets (2s10c20t) appliance. Half of it on the X6-2S which has one socket 1s10c20t.

The core factor for those processors is 0.5 which means that you can run an Enterprise Edition ‘odb2′ with a single processor license (public price 47,500$) and you can run 4 sessions in CPU which means more that you can do nearly 3 million logical reads per second, as here:

Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 4.0 54.6 0.00 13.80 DB CPU(s): 4.0 54.5 0.00 13.78 Logical read (blocks): 2,901,991.5 39,660,331.1

Those shapes are defined as:

Given the high LIOPS and the available memory, this entry-level appliance can be sufficient for most of medium OLTP workload.

Classes

Three classes are defined to derive the database parameters from the shape.

The SGA/PGA is calculated from the shape memory and a class factor.
OLTP gives 50% to SGA and 25% to PGA which means that for example a odb4 has sga_target=16 and pga_aggregate_target=8G
DSS gives 25% to SGA and 50% to PGA

Note that OLTP is the only one available in Standard Edition. This does not mean that you can run only OLTP. You can change memory settings later (DSS usually need more PGA than SGA) and you have very good storage bandwidth and IOPS (NVMe access to SSD). This setting is more an indication that most of datawarehouses need features available only on Enterprise Edition such as parallel query, partitioning, bitmap indexes.

ASM or ACFS?

The template shapes above define a 100GB database. When you create a new database you have the choice to put it directly on +DATA and +RECO, or create a 100GB ADVM volume and ACFS filesystem that will be mounted under /u02/app/oracle/oradata. If you choose ACFS the FRA and REDO will be created under the /u03/app/oracle mount point which is a common ACFS.

The default is ACFS but you should think about it. For production, best performance is ASM. You have SSD to reduce avoid disk latency. You have NVMe to reduce CPU latency. You don’t want to add the ACFS layer. The maximum IOPS we observe is 10 times higher with datafiles directly on ASM:

@FranckPachot Don't know? We get 75k IOPS on ACFS and 700k IOPS on pure ASM. /cc @kevinclosson pic.twitter.com/TcLzUsOh0d

— Marco Mischke (@DBAMarco) August 29, 2016

For test databases, where you use snapshot features, especially with multitenant, you may choose ACFS. However, why not create the CDB in ASM and use ACFS for the PDBs you will want to snapshot? No need for that additional layer for the CDB files. Better to isolate the master and clones for a specific environment into its own ACFS.

And anyway, ODA X6-2S and X6-2M are very interesting for Standard Edition, and you cannot use snapshots nor any ACFS features for a database in Standard Edition.

Storage performance is truly amazing. At 100000 IOPS we have 99% single block reads faster than 256 microseconds and 97% faster than 128 us. At 800000 IOPS here are the figures:

% of Waits ----------------------------------------------- Total Event Waits <1ms <2ms <4ms <8ms <16ms <32ms 1s ------------------------- ------ ----- ----- ----- ----- ----- ----- ----- ----- db file parallel read 6.9M 4.0 74.1 21.7 .2 .0 .0 .0 db file sequential read 18.4M 83.1 16.3 .7 .0 .0 .0

So what?

It’s nice to have an easy GUI to provision a database on ODA. However there are some limits with it:

Be careful on the defaults. They may not fit what you want. Do you want you databases on ACFS?
Not all operations can be done though the GUI: you can create but not delete a database.

But there’s more. Performance is there. You can run application that need high performance.

Do you know any other solution which gives you a fully certified system installed in few hours with databases ready? With very good hardware and managed software costs (NUP, Standard Edition in socket metric or Entrerprise Edition capacity-on-demand by multiple of 1 processor license).
You need high-availability? In Standard Edition you cannot use Data Guard. In Standard Edition you can buy Dbvisit standby which gives you switchover and failover (Recovery Point Objective of few minutes) to a second ODA or to a cloud service. Of course, you can build or buy custom scripts to manage the manual standby. However, if you go to ODA you probably appreciate easy and robust software.

Cet article ODA X6 database classes and shapes est apparu en premier sur Blog dbi services.

↧

Filenames in AWR reports

August 29, 2016, 12:33 pm

≫ Next: The fastest way to get the Oracle sample schemas

≪ Previous: ODA X6 database classes and shapes

If you have read my latest blog posts, you know I’ve measured IOPS with SLOB to estimate ACFS overhead on a fast storage. This blog is about something I learned after wasting one hour on the result.

Here is how I did my tests:

Create a SLOB database in ACFS
Run SLOB PIO tests and tag the AWR report as ‘ACFS’
Move datafile to +DATA
Run SLOB PIO tests and tag the AWR report as ‘ASM’

Of course, I’ve scripted to run several tests varying the number of sessions, work unit, etc. while I was doing something more productive.

While done, I got a set of AWR report and the first task was to check that they were consistent. But they were not. The datafile in ‘File IO Stats’ section did not match the tag I’ve put in the file name. First I suspected a bug in my script with bad tagging or failed datafile move. I had to read the alert.log to get that my tagging was good but filename in AWR reports was wrong. I finally looked at AWR views to understand why the filename was wrong and understood the problem:

SQL> desc DBA_HIST_DATAFILE; Name Null? Type ----------------------------------------- -------- ---------------------------- DBID NOT NULL NUMBER FILE# NOT NULL NUMBER CREATION_CHANGE# NOT NULL NUMBER FILENAME NOT NULL VARCHAR2(513) TS# NOT NULL NUMBER TSNAME VARCHAR2(30) BLOCK_SIZE NUMBER CON_DBID NUMBER CON_ID NUMBER

There’s no SNAP_ID. AWR do not store the history of file names. We can suppose that it stores only the latest filename, but then my reports would be good as they were generated immediately after the snapshot. Or that the first name stays, but I had some reports with ‘+DATA’.

Then, I grepped for ‘WRH$_HISTORY’ in ORACLE_HOME/rdbms/admin and came upon this:

dbmsawr.sql: -- This routine updates WRH$_DATAFILE rows for the datafile name and dbmsawr.sql: -- WRH$_DATAFILE with the current information in database.

There is an update_datafile_info procedure here in the dbms_workload_repository and the comment says something like:
This change will be captured at max after some -- (generally 50) snapshots. So the AWR and AWR report may be wrong with -- respect to data file name or tablespace name for that duration.

I love to work with Oracle. All information is there if you know where to look at.

So if you want to rely on filename in an AWR report after a move, you should run this procedure before taking the report. And you should run this report before the next datafile move.

Here is the example:

SQL> exec dbms_workload_repository.create_snapshot; PL/SQL procedure successfully completed. SQL> select file#,filename from DBA_HIST_DATAFILE where file#=6; FILE# FILENAME ---------- -------------------------------------- 6 /u01/DEMO/oradata/DEMO14/users01.dbf SQL> select snap_id,file#,filename from DBA_HIST_FILESTATXS where file#=6 order by snap_id fetch first 10 rows only; SNAP_ID FILE# FILENAME ---------- ---------- -------------------------------------- 1244 6 /u01/DEMO/oradata/DEMO14/users01.dbf 1245 6 /u01/DEMO/oradata/DEMO14/users01.dbf 1246 6 /u01/DEMO/oradata/DEMO14/users01.dbf 1247 6 /u01/DEMO/oradata/DEMO14/users01.dbf 1248 6 /u01/DEMO/oradata/DEMO14/users01.dbf 1249 6 /u01/DEMO/oradata/DEMO14/users01.dbf 6 rows selected.

My file is user01 and this is what is stored in AWR.

I rename it to users02 (thanks to 12c online move)

SQL> alter database move datafile '/u01/DEMO/oradata/DEMO14/users01.dbf' to '/u01/DEMO/oradata/DEMO14/users02.dbf'; Database altered.

but AWR is not aware of the change even after a snapshot:

You have to wait for those 50 snapshots or run the update:

SQL> exec dbms_workload_repository.update_datafile_info; PL/SQL procedure successfully completed.

SQL> select file#,filename from DBA_HIST_DATAFILE where file#=6;

FILE# FILENAME
---------- --------------------------------------
6 /u01/DEMO/oradata/DEMO14/users02.dbf

SQL> select snap_id,file#,filename from DBA_HIST_FILESTATXS where file#=6 order by snap_id fetch first 10 rows only;

SNAP_ID FILE# FILENAME
---------- ---------- --------------------------------------
1244 6 /u01/DEMO/oradata/DEMO14/users02.dbf
1245 6 /u01/DEMO/oradata/DEMO14/users02.dbf
1246 6 /u01/DEMO/oradata/DEMO14/users02.dbf
1247 6 /u01/DEMO/oradata/DEMO14/users02.dbf
1248 6 /u01/DEMO/oradata/DEMO14/users02.dbf
1249 6 /u01/DEMO/oradata/DEMO14/users02.dbf
1250 6 /u01/DEMO/oradata/DEMO14/users02.dbf

But as you see no history about previous names.

Note that if you look at the table behind the view, there’s a SNAP_ID but it’s not part of the primary key. It is used by the purge procedures.

Cet article Filenames in AWR reports est apparu en premier sur Blog dbi services.

↧

The fastest way to get the Oracle sample schemas

August 29, 2016, 10:52 pm

≫ Next: Rolling Invalidate Window Exceeded

≪ Previous: Filenames in AWR reports

Do you need the Oracle sample schemas to do a quick test or demonstration? And, as always, you did not install the sample schemas when you did the setup of your environment? The probably fastest way to get them installed is to download them from github. Installation instructions are there as well. Have fun …

Cet article The fastest way to get the Oracle sample schemas est apparu en premier sur Blog dbi services.

↧

Rolling Invalidate Window Exceeded

August 30, 2016, 2:05 pm

≫ Next: Result cache side effects on number of calls

≪ Previous: The fastest way to get the Oracle sample schemas

Today I was doing a hard parse storm post-mortem analysis. One hypothesis was rolling invalidation causing invalidation, but figures didn’t match. I often reproduce the hypothesis to check the numbers to be sure I interpret them correctly. Especially the timestamps in V$SQL_SHARED_CURSOR.REASON. And as it may help others (including myself in the future) I share the test case.

I create a table with one row (12c online statistics gathering, so num_rows is 1) and then insert one more row.
21:31:26 SQL> create table DEMO as select * from dual; Table created. 21:31:26 SQL> insert into DEMO select * from dual; 1 row created. 21:31:26 SQL> commit; Commit complete.
I run a query on the table. I don’t care about the result, so let’s put it something that will be useful later: the UTC time as the number of seconds since Jan 1st, 1970 (aka Epoch)
21:32:52 SQL> select (cast(sys_extract_utc(current_timestamp) as date)-date'1970-01-01')*24*3600 from DEMO; (CAST(SYS_EXTRACT_UTC(CURRENT_TIMESTAMP)ASDATE)-DATE'1970-01-01')*24*3600 ------------------------------------------------------------------------- 1472585572 1472585572
The execution plan cardinality estimation is 1 row as this is what is in object statistics.
21:32:52 SQL> select * from table(dbms_xplan.display_cursor(null,null)); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ SQL_ID 61x2h0y9zv0r6, child number 0 ------------------------------------- select (cast(sys_extract_utc(current_timestamp) as date)-date'1970-01-01')*24*3600 from DEMO Plan hash value: 4000794843 ------------------------------------------------------------------ | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | 2 (100)| | | 1 | TABLE ACCESS FULL| DEMO | 1 | 2 (0)| 00:00:01 | ------------------------------------------------------------------
I gather statistics with all default attributes, so rolling invalidation occurs.
21:32:52 SQL> exec dbms_stats.gather_table_stats(user,'DEMO'); PL/SQL procedure successfully completed.

At this time, the cursor has been parsed only once:
21:32:52 SQL> select invalidations,loads,parse_calls,executions,first_load_time,last_load_time,last_active_time from v$sql where sql_id='61x2h0y9zv0r6'; INVALIDATIONS LOADS PARSE_CALLS EXECUTIONS FIRST_LOAD_TIME LAST_LOAD_TIME LAST_ACTIVE_TIME ------------- ---------- ----------- ---------- ------------------- ------------------- ------------------ 0 1 1 1 2016-08-30/21:32:51 2016-08-30/21:32:51 30-AUG-16 21:32:51

By default the invalidation window is 5 hours. I don’t want to wait so I set it to something shorter- 15 seconds:
21:32:54 SQL> alter system set "_optimizer_invalidation_period"=15; System altered.

There will not be any invalidation until the next execution. To prove it I wait 20 seconds, run the query again and check the execution plan:
21:33:12 SQL> select (sysdate-date'1970-01-01')*24*3600 from DEMO; (SYSDATE-DATE'1970-01-01')*24*3600 ---------------------------------- 1472592792 1472592792 21:33:12 SQL> select * from table(dbms_xplan.display_cursor('61x2h0y9zv0r6',null)); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ SQL_ID 61x2h0y9zv0r6, child number 0 ------------------------------------- select (cast(sys_extract_utc(current_timestamp) as date)-date'1970-01-01')*24*3600 from DEMO Plan hash value: 4000794843 ------------------------------------------------------------------ | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | 2 (100)| | | 1 | TABLE ACCESS FULL| DEMO | 1 | 2 (0)| 00:00:01 | ------------------------------------------------------------------
This is still the old cursor (child number 0) with old stats (num_rows=1)

However, from this point rolling invalidation occurs: a random timestamp is generated within the rolling window (15 seconds here – 5 hours in default database).

I don’t know how to see this timestamp at that point (comments welcome) so I run the query several times within this 15 seconds window to see when it occurs:
21:33:16 SQL> select (cast(sys_extract_utc(current_timestamp) as date)-date'1970-01-01')*24*3600 from DEMO; (CAST(SYS_EXTRACT_UTC(CURRENT_TIMESTAMP)ASDATE)-DATE'1970-01-01')*24*3600 ------------------------------------------------------------------------- 1472585596 1472585596 21:33:19 SQL> select (cast(sys_extract_utc(current_timestamp) as date)-date'1970-01-01')*24*3600 from DEMO; (CAST(SYS_EXTRACT_UTC(CURRENT_TIMESTAMP)ASDATE)-DATE'1970-01-01')*24*3600 ------------------------------------------------------------------------- 1472585599 1472585599 21:33:22 SQL> select (cast(sys_extract_utc(current_timestamp) as date)-date'1970-01-01')*24*3600 from DEMO; (CAST(SYS_EXTRACT_UTC(CURRENT_TIMESTAMP)ASDATE)-DATE'1970-01-01')*24*3600 ------------------------------------------------------------------------- 1472585602 1472585602 21:33:25 SQL> select (cast(sys_extract_utc(current_timestamp) as date)-date'1970-01-01')*24*3600 from DEMO; (CAST(SYS_EXTRACT_UTC(CURRENT_TIMESTAMP)ASDATE)-DATE'1970-01-01')*24*3600 ------------------------------------------------------------------------- 1472585605 1472585605 21:33:28 SQL> select (cast(sys_extract_utc(current_timestamp) as date)-date'1970-01-01')*24*3600 from DEMO; (CAST(SYS_EXTRACT_UTC(CURRENT_TIMESTAMP)ASDATE)-DATE'1970-01-01')*24*3600 ------------------------------------------------------------------------- 1472585608 1472585608 21:33:31 SQL> select (cast(sys_extract_utc(current_timestamp) as date)-date'1970-01-01')*24*3600 from DEMO; (CAST(SYS_EXTRACT_UTC(CURRENT_TIMESTAMP)ASDATE)-DATE'1970-01-01')*24*3600 ------------------------------------------------------------------------- 1472585611 1472585611
After those runs, I check that I have a new execution plan with new estimation from new statistics (num_rows=2):
21:33:31 SQL> select * from table(dbms_xplan.display_cursor('61x2h0y9zv0r6',null)); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ SQL_ID 61x2h0y9zv0r6, child number 0 ------------------------------------- select (cast(sys_extract_utc(current_timestamp) as date)-date'1970-01-01')*24*3600 from DEMO Plan hash value: 4000794843 ------------------------------------------------------------------ | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | 2 (100)| | | 1 | TABLE ACCESS FULL| DEMO | 1 | 2 (0)| 00:00:01 | ------------------------------------------------------------------ SQL_ID 61x2h0y9zv0r6, child number 1 ------------------------------------- select (cast(sys_extract_utc(current_timestamp) as date)-date'1970-01-01')*24*3600 from DEMO Plan hash value: 4000794843 ------------------------------------------------------------------ | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | 2 (100)| | | 1 | TABLE ACCESS FULL| DEMO | 2 | 2 (0)| 00:00:01 | ------------------------------------------------------------------
Yes, I have a new child cursor, child number 1. A new cursor means that I have a reason in V$SQL_SHARED_CURSOR:
21:33:31 SQL> select child_number,reason from v$sql_shared_cursor where sql_id='61x2h0y9zv0r6'; CHILD_NUMBER REASON ------------ -------------------------------------------------------------------------------- 0 <ChildNode><ChildNumber>0</ChildNumber><ID>33</ID><reason>Rolling Invalidate Win dow Exceeded(2)</reason><size>0x0</size><details>already_processed</details></Ch ildNode><ChildNode><ChildNumber>0</ChildNumber><ID>33</ID><reason>Rolling Invali date Window Exceeded(3)</reason><size>2x4</size><invalidation_window>1472585604< /invalidation_window><ksugctm>1472585607</ksugctm></ChildNode> 1
Child cursor number 0 has not been shared because of rolling invalidation. The invalidation_window number, 1472585604, is the timestamp set by rolling invalidation, set at first parse call after stats gathering, and defined within the rolling window that follows. After this one (1472585604 is 21:33:24 in my GMT+2 timezone) the cursor will not be shared and a new hard parse occurs. I think that ksugctm is the timestamp when the new cursor is created. 1472585607 is 21:33:27 here in Switzerland. You see the corresponding timestamps in V$SQL:
21:33:31 SQL> select invalidations,loads,parse_calls,executions,first_load_time,last_load_time,last_active_time from v$sql where sql_id='61x2h0y9zv0r6'; INVALIDATIONS LOADS PARSE_CALLS EXECUTIONS FIRST_LOAD_TIME LAST_LOAD_TIME LAST_ACTIVE_TIME ------------- ---------- ----------- ---------- ------------------- ------------------- ------------------ 0 1 5 5 2016-08-30/21:32:51 2016-08-30/21:32:51 30-AUG-16 21:33:24 0 1 2 2 2016-08-30/21:32:51 2016-08-30/21:33:27 30-AUG-16 21:33:30

Ok. Important thing is that the ‘rolling invalidation’ is not an invalidation (as V$SQL.INVALIDATIONS=0) of the cursor, but just non-sharing of the child.

If we gather statistics with immediate invalidation, it’s different:
21:33:31 SQL> exec dbms_stats.gather_table_stats(user,'DEMO',no_invalidate=>false); PL/SQL procedure successfully completed. 21:33:34 SQL> select (cast(sys_extract_utc(current_timestamp) as date)-date'1970-01-01')*24*3600 from DEMO; (CAST(SYS_EXTRACT_UTC(CURRENT_TIMESTAMP)ASDATE)-DATE'1970-01-01')*24*3600 ------------------------------------------------------------------------- 1472585614 1472585614 21:33:34 SQL> select child_number,reason from v$sql_shared_cursor where sql_id='61x2h0y9zv0r6'; CHILD_NUMBER REASON ------------ -------------------------------------------------------------------------------- 0 <ChildNode><ChildNumber>0</ChildNumber><ID>33</ID><reason>Rolling Invalidate Win dow Exceeded(3)</reason><size>2x4</size><invalidation_window>1472585604</invalid ation_window><ksugctm>1472585607</ksugctm></ChildNode><ChildNode><ChildNumber>0< /ChildNumber><ID>33</ID><reason>Rolling Invalidate Window Exceeded(2)</reason><s ize>0x0</size><details>already_processed</details></ChildNode>
I’ve only one child here, a new one, and I’m not sure the reason has a meaning.
21:33:34 SQL> select invalidations,loads,parse_calls,executions,first_load_time,last_load_time,last_active_time from v$sql where sql_id='61x2h0y9zv0r6'; INVALIDATIONS LOADS PARSE_CALLS EXECUTIONS FIRST_LOAD_TIME LAST_LOAD_TIME LAST_ACTIVE_TIME ------------- ---------- ----------- ---------- ------------------- ------------------- ------------------ 1 2 1 1 2016-08-30/21:32:51 2016-08-30/21:33:33 30-AUG-16 21:33:33
This is an invalidation of the cursor. Old children cursors are removed and the proud parent is marked as invalidated 1 time.

Cet article Rolling Invalidate Window Exceeded est apparu en premier sur Blog dbi services.

↧

Result cache side effects on number of calls

September 13, 2016, 2:38 am

≫ Next: DOAG 2016, Schulungstag: Oracle Grid Infrastructure

≪ Previous: Rolling Invalidate Window Exceeded

During the execution of a SQL statement, you cannot guess how many times an operation, a predicate, or a function will be executed. This depends on the execution plan, on some caching at execution, and some other execution time decisions. Here is an example where result cache may bring some overhead by calling a function multiple times.

Here is my function:
SQL> create or replace function F return number is 2 begin 3 dbms_lock.sleep(5); 4 dbms_output.put_line('Hello World'); 5 return 255; 6 end; 7 / Function created.
The function displays ‘Hello World’ so that I can check how many times it is executed (I’ve set serveroutput on).

Obviously, on a one row table, it is called only once:
SQL> select f from dual; F ---------- 255 Hello World

Query result cache miss

I’ll run now the same query but with the result cache hint. The first execution will have to execute the query because the cache is empty at that point:

SQL> exec dbms_result_cache.flush; PL/SQL procedure successfully completed. SQL> select /*+ result_cache */ f from dual; F ---------- 255 Hello World Hello World

Here is what I wanted to show: ‘Hello World’ is displayed two times instead of one. If your function is an expensive one, then the first execution, or every cache miss, will have a performance overhead.

Query result cache hit

Now that the result is in the cache:

SQL> select id, type, status, name from v$result_cache_objects; ID TYPE STATUS NAME ---------- ---------- --------- ------------------------------------------------------------ 33 Dependency Published DEMO.F 34 Result Published select /*+ result_cache */ f from dual

and the table has not changed (it’s DUAL here :; ) further executions do not call the function anymore, which is the expected result.

SQL> select /*+ result_cache */ f from dual ; F ---------- 255

Bug or not?

Bug 21484570 has been opened for that and closed as ‘Not a bug’. There is no guarantee that the function is evaluated once, twice, more or never.
Ok, why not. That’s an implementation decision. Just think about it if you want to workaround an expensive function called for each row, then query result cache may not be the right solution (except if all tables are static and you always have cache hits).

Note that if the function is declared as deterministic, it is executed only once.

You can workaround the issue by using result cache at function level (in place, or in addition to query result cache if you need it).

SQL> create or replace function F return number RESULT_CACHE is 2 begin 3 dbms_lock.sleep(5); 4 dbms_output.put_line('Hello World'); 5 return 255; 6 end; 7 / Function created. SQL> select /*+ result_cache */ f from dual; F ---------- 255 Hello World SQL> select id, type, status, name from v$result_cache_objects; ID TYPE STATUS NAME ---------- ---------- --------- ------------------------------------------------------------ 64 Dependency Published DEMO.F 66 Result Published "DEMO"."F"::8."F"#e17d780a3c3eae3d #1 65 Result Published select /*+ result_cache */ f from dual

So, not a big problem. Just something to know. And anyway, the right design is NOT to call a function for each row because it’s not scalable. Pipeline functions should be used for that.

Cet article Result cache side effects on number of calls est apparu en premier sur Blog dbi services.

↧

DOAG 2016, Schulungstag: Oracle Grid Infrastructure

September 20, 2016, 2:11 am

≫ Next: Modern software architecture – what is a database?

≪ Previous: Result cache side effects on number of calls

Wie auch in den letzten Jahren werden wir auch dieses Jahr wieder einen Schlungstag an der DOAG ausrichten. Dieses mal wird sich alles um die Oracle Clusterware (“Infrastruktur & Middelware” –> “Oracle Grid Infrastructure”) drehen: Los geht es mit den Anforderungen an Netzwerk, Speichersysteme, Betriebssystem und Kernel Parameter, denn nur wenn die Basis auch stimmt arbeitet eine Cluster-Lösung auch verlässlich. Wie immer bei uns wird es zahlreiche Live Demos geben, denn zeigen ist immer besser als nur erzählen. Es geht dann weiter mit der Architektur, der Installation, Konfiguration und dem Betrieb der Lösung. Natürlich gehen wir auch detailliert auf die Fehlersuche und Analyse ein. Am Ende des Tages sollte jedem klar sein was genau die Oracle Clusterware ist, wofür man sie einsetzen kann und sollte und auf was genau zu achten ist. Oracle selbst setzt die Clusterware bei fast allen Engineered Systems sowieso schon ein, also lieber gleich wissen worauf es ankommt.
Wer dann immer noch das Gefühl hat mehr Praxis zu benötigen kann sich gerne unseren Workshop zum selben Thema ansehen.

Wir freuen uns jetzt schon auf die zahlreichen Besucher an der DOAG 2016.

Cet article DOAG 2016, Schulungstag: Oracle Grid Infrastructure est apparu en premier sur Blog dbi services.

↧

Modern software architecture – what is a database?

September 24, 2016, 3:03 pm

≫ Next: Fun with PL/SQL code reviews – Part 1

≪ Previous: DOAG 2016, Schulungstag: Oracle Grid Infrastructure

This blog post is focused at developers and software architects. I’m probably not writing at the right place. You’re on an infrastructure experts blog and the author is an Oracle DBA. So what can you learn from someone working on that 30 years old technology talking about that old SQL language ? You run with modern languages, powerful frameworks, multi-layer architecture, micro-services, distributed database and of course all open-source. You hate your DBA because he is the major slow-down for your agile development. You don’t want SQL. You don’t want databases. You don’t want DBA.

How can I encourage you to read this blog post? I was not always an DBA. I started as a developer, more than 20 years ago. And believe me, it was not prehistory at all. Object-Oriented design, Rapid Application Development, Automatic programming (remember C.A.S.E.?), visual programming (have you ever seen an IDE like IBM Visual Age?), query generation (early days of Business-Objects). All these evolved with more and more languages, frameworks, layers, micro-services, XML, SOA, JSON, REST,… but only one technology remained: the critial persistent data is still in a relational database and accessed by SQL.

What is a database

Most of developers think that a database is there to store and retrieve data. I’m sorry but that’s wrong. That may have been right a long time ago, with key-value storage and hierarchical databases, but that’s too old for me. When I started to work, databases were already doing far more than that. Let me explain. With those prehistoric databases, you retrieved data in the same way you stored it. You insert with a key, you fetch with that key. It is easy to explain to modern developers because they “invented” it few years ago, calling it CRUD (Create Read Update Delete). First argument of those CRUD methods is a key value. And you store unformatted data as XML or JSON associated to that value. If this is the only feature that you need, then for sure you don’t want a database.

Relational database management systems (RDBMS) are doing a lot more than that. First, you can query data in a completely different way than you inserted it. And this is the real life-cycle of data. For example, You take orders, one by one, with customer and product information for each of them. Of course you can update and read it with the order ID that has been generated, but that’s only a small use case and probably not the most critical. Warehouse users will query (and update) orders by product. Delivery users will query (and update) orders by customer. Marketing users will query by channels and many other dimensions. Finance will join with accounting. With a persistence only system, you have to code all that. Of course if you declared the mapping of associations, you can navigate through them. But the user requirement is to get a set of orders, or a quantity of products in stock, or a subset of customers, which is different from navigating through orders one by one. With a database, you dont need to code anything. With a proper data model what you inserted can be manipulated without its key value. All data that you have inserted can be accessed from any different point of view. And you don’t have to code anything for that. Imagine a Data Access Object with ‘QueryBy methods covering any combination of columns and operators.

A database system does not only store data, it processes data and provide a service to manipulate data.

SQL

SQL is not a language to code how to get the information. SQL only describes what you want. It’s a question you ask to a data service. Same idea as Uber where you enter your destination and desired car and the service manages everything for you: the path, the communication, the paiement, the security. You may not like the SQL syntax, but it can be generated. I’m not talking about generating CRUD statements here, but generating SQL syntax from a SQL semantic expressed in Java or example. There’s a very good example for that: jOOQ (look at the exemples there).

I understand that you can hate SQL for it’s syntax. SQL was build for pre-compilers, not for execution time parsing of text, and I’ll come back on that later with static SQL. But you can’t say that SQL semantic is not modern. It’s a 4th generation language that saves all the procedural coding you have to do with 3rd generation languages. SQL is a declarative language build on a mathematics theory. It goes far beyond the for() loops and if()else.

In SQL you describe the result that you want. How to retrieve the data is done by the database system. The optimizer builds the procedural code (know as the execution plan) and the execution engine takes care of everything (concurrency, maintaining redundant structures for performance, caching, multithreading, monitoring, debugging, etc). Do you really want to code all that or do you prefer to rely on a data service that does everything for you?

You know why developers don’t like SQL? Because SQL has not been designed for programmers. It was for users. The goal was that a non-programmer can ask its question to the system (such as “give me the country of the top customers having bought a specific product in last 3 months”) without the need of a developer. There was no GUI at that time, only Command Line Interface, and SQL was the User Friendly Interface to the database. Today we have GUIs and we don’t need SQL. But it is there so programmers build tools or framework to generate SQL from a programming language. Sure it is ridiculous and it would be better to have a programming language that directly calls the SQL semantic without generating plain old English text. We need a Structured Query Language (SQL) we just don’t need it to be in English.

Set vs loops

So why do people prefer to code everything in procedural language (3GL)? Because this is only what they learned. If at school you learned only loops and comparisons, then you are going to access data in loops. If you learned to think about data as sets, then you don’t need loops. Unfortunately, the set concepts are teached in mathematics classes but not in IT.

Imagine you have to print “Hello World” 5 times. Which pseudo-code so you prefer?

print("Hello World\n") print("Hello World\n") print("Hello World\n") print("Hello World\n") print("Hello World\n")

print ( "Hello World\n" + "Hello World\n" + "Hello World\n" + "Hello World\n" + "Hello World\n" )

I’ve put that in pseudo-code. I don’t want to play with String and StringBuffer here. But the idea is only to explain that if you have to process a set of things it is more efficient to process them as a set rather than one-by-one. That works for everything. And this is where databases rocks: they process sets of rows. If you have to increment the column N by one in every row of your table T, you don’t need to start a loop and increment the column row-by-row. Just ask your RDBMS data service to do it: ‘/* PLEASE */ UPDATE T set N=N+1′. The “please” is in comment because everything that is not there to describe the result is not part of SQL. You have to use hints to force the way to do it, but they are written as comments because SQL do not allow any way to tell how to do it. This was a joke of course, the “please” is not mandatory because we are talking to a machine.

ACID

I’m not sure you get all the magic that is behind:
UPDATE T set N=N+1;

it’s not a simple loop as:
for each row in T set N=N+1

The RDBMS does more than that. Imagine that there is a unique index on the column N. How many lines of code do you need to do that N=N+1 row by row and be sure that at any point you don’t have duplicates? Imagine that after updating half of the rows you encounter someone else currently updating the same row. You have to wait for his commit. But then, if he updated the value of N, do you increment the past value or the new one? You can’t increment the old one or his modification will be lost. But if you increment the new one, your previous incremented rows are inconsistent because they were based on a previous state of data.
I was talking about having an index. You have to maintain this index as well. You have to be sure that what is in cache is consistent with what is in disk. That modifications made in the cache will not be lost in case of server failure. And if you run in a cluster, those caches must be synchronized.

Coding the same as this “UPDATE T set N=N+1″ in a procedural language is not easy and can become very complex in a multi-user environment.

Maybe you have all the tools you need to generate that code. But if you code it you have to test it. Are your tests covering all concurrency cases (sessions updating or reading same rows, or different rows from same table,…). What is already coded within the database has already been tested. It’s a service and you just have to use it.

Static SQL

I claimed above that SQL is there to be pre-compiled. Yes, SQL is witten in plain text, like most of programming languages, and must be parsed, optimized, compiled. It’s not only for performance. The main reason is that you prefer to get errors at compile time than at runtime. If you put SQL in text strings in your code it will remain text until execution time when it will be prepared. And only then you will get errors. The second reason is that when the SQL is parsed, it is easy to find the dependencies. Want to see all SQL statements touching to a specific column? Do you prefer to do guess on some text search or to methodically follow dependencies?

Yes, SQL is there to be static and not dynamic. That claim may look strange for an Oracle DBA because all statements are dynamic in Oracle. Even at the time of precompilers (such as Pro*C) the statements were parsed but were put as text in the binary. And at first execution, they are parsed again and optimized. If you want the execution plan to be defined at deployment time, you have to use Outlines or SQL Plan Baselines. There is no direct way to bind the execution plan at deployment time in Oracle. In my opinion the static SQL as it is known on DB2 for example is really missing in Oracle. OLTP Software Vendors would love to ship the optimized execution plans with their application. Imagine that all SQL statements in an OLTP application are parsed and optimized, compiled as bound procedures, similar to stored procedures, with procedural access (the execution plan) and you just have to call them. For reporting, DSS, BI you need the plans to adapt to the values and volume of data, but for OLTP you need stability. And from the application, you just call those static SQL like a data service.

Talking about procedural execution stored in the database, I’m coming to stored procedures and PL/SQL of course.

Stored Procedures

When you code in your 3GL language, do you have functions that update global variables (BASIC is the first language I learned and this was the idea) or do you define classes which encapsulate the function and the data definition? The revolution of Object Oriented concepts was to put data and logic at the same place. It’s better for code maintainability with direct dependency procedural code and data structures. It’s better for security because data is accessible only through provided methods. And it’s better for performance because procedural code access data at the same place.

Yes Object Oriented design rocks and this why you need to put business logic in the database. Putting the data on one component and running the code on another component of an information system is the worst you can do. Exactly as if in your Object Oriented application you store the object attributes on one node and run the methods on another one. And this is exactly what you do with the business logic outside of the database. Your DAO objects do not hold the data. The database does. Your objects can hold only a copy of the data, but the master copy where are managed concurrency management, high availability and persistance is in the database.

We will talk about the language later, this is only about the fact that the procedural code run in the same machine and the same processes than the data access.
There are a lot of myths about running business logic in the database. Most of them come from ignorance. Until last Monday I believed that one argument against running business logic in the database was unbeatable: You pay Oracle licences on the number of CPU, so you don’t want to use the database CPUs to run something that can run on a free server. I agreed with that without testing it, and this is where myths come from.

But Toon Koppelaars has tested it and he proved that you use more database CPU when you put the business logic outside of the database. I hope his presentation from Oak Table World 2016 will be available soon. He proved that by analyzing exactly what is running in the database, using linux perf and flame graphs: https://twitter.com/ChrisAntognini/status/778273744242352128

All those rountrips from remote compute server, all those row-by-row processing coming from that design have an higher footprint on the database CPUs that directly running the same on the database server.

PL/SQL

Running business logic on the database server can be done with any language. You can create stored procedures in Java. You can code external procedures in C. But those languages have not been designed for data manipulation. It is sufficient to call SQL statements but not when you need procedural data access. PL/SQL is a language made for data processing. It’s not only for stored procedure. But it’s the only language that is coupled with your data structure. As I said above, it’s better to think in sets with SQL. But it may be sometimes complex. With PL/SQL you have a procedural language which is intermediate between row-by-row and sets because it has some bulk processing capabilities.

In pseudo-code the Hello World above is something like that:

forall s in ["Hello World\n","Hello World\n","Hello World\n","Hello World\n","Hello World\n"] print(s)

It looks like a loop but it is not. The whole array is passed to the print() function and loop is done at lower level.

In PL/SQL you can also use pipeline functions where rows are processed with a procedural language but as a data stream (like SQL does) rather than loops and calls.

I’ll go to other advantages of PL/SQL stored procedures but here again there is one reason frequently raised to refuse PL/SQL. You can find more developers in .Net or Java than in PL/SQL. And because they are rare, they are more expensive. But there is a counter argument I heard this week at Oracle Open World (but I don’t remember who raised that point unfortunately). PL/SQL is easy to learn. Really easy. You have begin – exception – end blocks, you declare all variables, you can be modular with procedures and inline procedures, you separate signature and body, you have very good IDE, excellent debugger and easy profiler,… and you can run it on Oracle XE which is free. So, if you have a good Java developer he can write efficient PL/SQL in a few days. By good developer, I mean someone who understands multi-user concurrency problems, bulk processing, multi-threading, etc.

There are less PL/SQL developers than Java developers because you don’t use PL/SQL. It’s not the opposite. If you use PL/SQL you will find developers and there are many software vendors that code their application in PL/SQL. Of course PL/SQL is not free (except in Oracle XE) but it runs on all platforms and on all editions.

Continuous Integration and Deployment, dependency and versioning

I come back quickly to the advantages of using a language that is coupled with your data.

PL/SQL stored procedures are compiled and all dependencies are stored. With one query on DBA_DEPENDENCIES you can know which tables your procedure is using and which procedures use a specific table. If you change the data model, the procedures that have to be changed are immediately invalidated. I don’t know any other language that does that. You don’t want to break the continuous integration build every time you change something in a table structure? Then go to PL/SQL.

Let’s go beyond continuous integration. How do you manage database changes in continuous deployment? Do you know that with PL/SQL you can modify your data model online, with your application running and without breaking anything? I said above that procedures impacted by the change are invalidated and the must be adapted to be able to be compiled. But this is only for the new version. You can deploy a new version of those procedures while the previous version is running. You can test this new version and only when everything is ok you switch the application to the new version. The feature is called Edition Based Redefinition (EBR) it exists since 11g in all Oracle editions. It’s not known and used enough, but all people I know that use it are very happy with it.

In development environment and continuous integration, it is common to say that the database always cause problem. Yes it is true but it’s not inherent to the database but the data. Data is shared and durable and this is what makes it complex. The code can be deployed in different places, and can be re-deployed if lost. Data can be updated at only one place and visible to all users. Upgrading to a new version of application is easy: you stop the old version and start the new version. For data it is different: you cannot start from scratch and you must keep and upgrade the previous data.

Object-Relational impedance

I’m probably going too far in this blog post but the fact that data is shared and durable is the main reason why we cannot apply same concepts to data objects (business objects) and presentation objects (GUI components). Application objects are transient. When you restart the application, you create other objects. The identity of those objects is an address in memory: it’s different on other systems and it’s different once application is restarted. Business objects are different. When you manipulate a business entity, it must have the same identity for any users, and this identity do not change when application is restarted, not even when application is upgraded. All other points given as “object-relational impedance” are minor. But the sharing and durability of business object identity is the reason why you have to think differently.

Where to put business logic?

If you’re still there, you’ve probably understood that it makes sense to run the data logic in the database, with declarative SQL or procedural PL/SQL stored procedures, working in sets or at least in bulk, and with static SQL as much as possible, and versioned by EBR.

Where to put business logic then? Well, business logic is data logic for most of it. But you’ve always learned that business logic must be in the application tier. Rather than taking reasons given one by one and explain what’s wrong with them, let me tell you how came this idea of business logic outside of the database. The idea came from my generation: the awesome idea of client/server.

At first, data was processed on the servers and only the presentation layer was on the user side (for example ISAM was very similar with what we do with thin web pages). And this worked very well, but it was only green text terminals. Then came PCs and Windows 3.11 and we wanted graphical applications. So we built applications on our PCs. But that was so easy that we implemented all business logic there. Not because it’s a better architecture, but because anyone can build his application without asking to the sysops. This was heaven for developers and a nightmare for operations to deploy those applications on all the enterprise PCs.
But this is where offloading business logic started. Application written with nice IDEs (I did this with Borland Paradox and Delphi) connecting directly to the database with SQL. Because application was de-correlated from the database everything was possible. We even wanted to have applications agnostic of the database, running in any RDBMS. Using standard SQL and standard ODBC. Even better: full flexibility for the developer by using only one table with Entity-Value-Attribute.

Actually, the worst design anti-patterns have been invented at that time and we still see them in current applications – totally unscalable.

When finally the deployment of those client/server applications became a nightmare, and because internet was coming with http, html, java, etc. we went to 3-tier design. Unfortunately, the business logic remained offloaded in the application server instead of being part again of the database server.

I mentioned ODBC and it was another contributor to that confusion. ODBC looks like a logical separation of the application layer and the database layer. But that’s wrong. ODBC is not a protocol. ODBC is an API. ODBC do not offer a service: it is a driver running on both layers and that magically communicates through network: code to process data on one side and data begin on the other.

A real data service encapsulates many SQL statements and some procedural code. And it is exactly the purpose of stored procedures. This is how all data applications were designed before that client/server orgy and this is how they should be designed today when we focus on centralization and as micro-services applications.

So what?

This blog post is already too long. It comes from 20 years experience as developer, application DBA, and operation DBA. I decided to write this when coming back from the Oracle Open World where several people are still advocating for the right design, especially Toon Koppelaars about Thick Database at Oak Table World and the amazing panel about “Thinking clearly about application architecture” with Toon Koppelaars, Bryn Llewellyn, Gerald Venzl, Cary Millsap, Connor McDonald

The dream of every software architect should be to attend that panel w/ @ToonKoppelaars @BrynLite @GeraldVenzl @CaryMillsap @connor_mc_d pic.twitter.com/npLzpnktMK

— Franck Pachot (@FranckPachot) September 22, 2016

Beside the marketing stuff, I was really impressed by the technical content around the Oracle Database this year at OOW16.

Cet article Modern software architecture – what is a database? est apparu en premier sur Blog dbi services.

↧

Fun with PL/SQL code reviews – Part 1

October 6, 2016, 8:32 am

≫ Next: Oracle – Suspending/Resuming an Oracle PID with “oradebug suspend/resume” or the OS kill command

≪ Previous: Modern software architecture – what is a database?

For quite a long time I did not work anymore with PL/SQL and was quite happy when I had the chance to review some code at a customer. The status today: I am not that happy anymore Let me explain why and show you some examples on what was discovered. Of course all of the identifiers have been obfuscated and this is not to blame anyone. It is more to make people aware of what you should not do and that you have to be as exact as possible when you do programming. Even more important: Code that you write must be maintainable and understandable by others.

We start with a simple “if-then-else-end if” block:

         IF c <= d
         THEN
            IF c = d
            THEN
               l2 := l2 || l3;
            ELSE
               l2 := l2 || l3 || ',';
            END IF;
         END IF;

So what is wrong with this? The code does what it is supposed to do, true. But this is more complicated than it needs to be. Lets look at the first line:

         IF c <= d

If we enter this “IF” then we know that c is either less than d or equal to d, correct? On line 3 we know that c is d if we enter that “IF”. What does this imply for line 6 (the “ELSE”)?
It implies that c is less than d when we enter the “ELSE”. But then we could also write it like this:

            IF c = d
            THEN
               l2 := l2 || l3;
            ELSIF c < d THEN
               l2 := l2 || l3 || ',';
            END IF;

This is much more clear and less code. We are only interested if c is equal to d or less than d, that’s it. Then we should design the code exactly for that use case.

The next example is about the usage of a record. This is the code block:

         ...

         ll_col1 SCHEMA.TABLE.COLUMN1%TYPE;
         ll_col2 SCHEMA.TABLE.COLUMN2%TYPE;
         ll_col3 SCHEMA.TABLE.COLUMN3%TYPE;
         ll_col4 SCHEMA.TABLE.COLUMN4%TYPE;
         ...
         DECLARE
            TYPE MyRecord IS RECORD
            (
               l_col1    SCHEMA.TABLE.COLUMN1%TYPE,
               l_col2    SCHEMA.TABLE.COLUMN2%TYPE,
               l_col3    SCHEMA.TABLE.COLUMN3%TYPE,
               l_col4    SCHEMA.TABLE.COLUMN4%TYPE
            );
            rec1   MyRecord;
         BEGIN
            v_sql := 'SELECT col1, col2, col3, col4 FROM SCHEMA.TABLE WHERE col3 = '''
               || ll_col3
               || '''';
 
            EXECUTE IMMEDIATE v_sql INTO rec1;

            ll_col1 := rec1.l_col1;
            ll_col2 := rec1.l_col2;
            ll_col3 := rec1.l_col3;
            ll_col4 := rec1.l_col4;
         EXCEPTION
            WHEN NO_DATA_FOUND
            THEN
               ll_col3 := 0;
         END;

Beside that declaring a new block inside an existing block is really ugly what is wrong here? There is the definition of a new record which is then used to fetch the data from the dynamic sql statement. Nothing wrong here. But then the fields of this record are written back to four variables which are valid in and outside of that block, why that? Probably because the record is only valid in the “declare-begin-end” block. But for what do I need the record then at all? Really confusing

The next one is more about how easy it is to understand code written by someone else. This is the code block:

      SELECT COUNT (*)
        INTO l_count
        FROM USER_INDEXES
       WHERE INDEX_NAME = '' || l_index_name || '';

      IF l_count > 0
      THEN
         EXECUTE IMMEDIATE 'DROP INDEX ' || '' || l_index_name || '';
      END IF;

I really had to read this carefully until I understood why it was written this way. Probably the developer had this intension: “I want to know if a specific index exists and if yes, then I want to drop it”.
But what he did actually code is: I want to know if there are multiple indexes with the same name. As the query is against user_tables the answer can not be more than 1, correct? But then the correct and much more easy to understand way of writing this would be:

      SELECT COUNT (*)
        INTO l_count
        FROM USER_INDEXES
       WHERE INDEX_NAME = '' || l_index_name || '';

      IF l_count = 1
      THEN
         EXECUTE IMMEDIATE 'DROP INDEX ' || '' || l_index_name || '';
      END IF;

This is just a small change and does not affect the functionality at all but it is much more clear. Even more clear would be something like this:

      BEGIN
        SELECT 'index_exists'
          INTO l_exists
          FROM USER_INDEXES
         WHERE INDEX_NAME = '' || l_index_name || '';
      EXCEPTION
        WHEN NO_DATA_FOUND THEN l_exists := 'index_does_not_exist';
      END;
      IF l_exists = 'index_exists'
      THEN
         EXECUTE IMMEDIATE 'DROP INDEX ' || '' || l_index_name || '';
      END IF;

Using our language we can make the code much more easy to understand. Beside that the concatenation in the where clause and in the execute immediate statement is not required so the final code could look like this:

      BEGIN
        SELECT 'index_exists'
          INTO l_exists
          FROM USER_INDEXES
         WHERE INDEX_NAME = l_index_name;
      EXCEPTION
        WHEN NO_DATA_FOUND THEN l_exists := 'index_does_not_exist';
      END;
      IF l_exists = 'index_exists'
      THEN
         EXECUTE IMMEDIATE 'DROP INDEX ' || l_index_name;
      END IF;

In the next post we’ll look at some other examples which can be improved by changing the way we think about what we code and how others may interpret it.

Cet article Fun with PL/SQL code reviews – Part 1 est apparu en premier sur Blog dbi services.

↧