Monday, December 19, 2011

What are the options that are not available for global temporary tables ?

GTT  definitions are created in Data dictionary.  These table provide separate instance to each user who refers to them .

The following options are not available for global temporary tables:
- Any kind of constraints like check/referential cannot be applied on table
- Identity columns since data in GTT are materialized only for session
- Permanent Journaling cannot be done as data in tables are instances only to user in that session
- PPI cannot be applied as data does not get stored in PERM , only TEMP space is utilized here.

What are the options not available for volatile tables in teradata ?

The following options are not available for volatile tables because table definition is not stored in data dictionary
- Default values for columns 
- Title clause for columns
- Named Indexes for table
- Compression on columns/table level since table data are spooled
- Stats cannot be collected since data is materialized only for session 
- Identity columns as these again would need entry in IDcol tables
- PPI cannot be done on tables
- Any kind of constraints like check/referential cannot be applied on table

Tuesday, October 11, 2011

What are permanent journals in teradata?

- Journals are used to capture information about table in Teradata.  In case of Permanent journals they capture details of Journal enabled tables in teradata   with all the pre transaction and post transaction details . 
- Journal tables are assigned PERM space and they reside in same database as of parent or they can reside on different database.
- They are mainly used for protection of data and sometimes  also for disaster recovery ( fallback is better in this case )
- Permanent journal tables can be enabled or disabled by running alter database <databasename> 'no journal' /' journal = <databasename.jrnltbl>'
- Arcmain  utility provides the feature of backing  up Journal tables
- We can find details about all journal tables present in  teradata  database using DBC.JOURNALS table.

Tuesday, September 6, 2011

Query to find skew factor of a particular table ?


What is the acceptable range for skew factor in a table?

There is no particular range for skew factor.  In case of production systems, it is suggested to keep skew factor between 5-10. 
There are various considerations for skew factor
- Number of AMPS
- Size of row in a table
- number of records in a table
- PI of a table
- Frequent access of table (Performance consideration)
- whether table getting loaded daily /monthly or how frequently data is being refreshed

What is multivalued compression in Teradata?

Multivalued compression or just MVC is a compression technique applied on columns in Teradata .  MVC has a unique feature of compressing up-to 255 distinct values per column in a given table. 

The advantage of compression are
  • Reduced Storage cost by storing more of a logical data than physical data.
  • Performance is greatly improves due to  reduced retrieval of physical data for that column.   
  • Tables having compression always have an advantage since optimizer considers reduced I/O as one of the factors for generating EXPLAIN plan.

Sunday, July 24, 2011

What are the ways by which we can use zero to replace a null value for a given column ?

Answer -

1. By using Teradata SQL supported command as follows
Select Col1, ZEROIFNULL(Col2)  from Table_name;

2. By using ANSI SQL command as follows
a. Coalesce
Select Col1,COALESCE(Col2,0)    from Table_name;

b.Case operator
Case When Col2 IS NOT NULL
Then Col2
Else 0
from Table_name;

It is always suggested to use ANSI standard while coding in Teradata , since any changes in Teradata version due to upgrade/patches installation will lead to
- Time for regression testing
- rework of code.

Friday, July 8, 2011

What is a join index ? What are benefits of using Join index?


It is a index that is maintained in a system .It maintains rows  joined on two or more tables. Join index is useful for queries where the index structure contains all the columns referenced by one or more joins, thereby allowing the index to cover all or part of the query

Benefits if using join index is
- to eliminate base table access
- Aggregate processing is eliminated by creating aggregate join index
- It reduces redistribution of data since data is materialized by JI.
- Reduces complex join conditions by using covering queries

Can a macro be called inside a macro?

The main purpose of run a set of repeated sql queries.   Macro supports only DML queries .
Hence We cant call any-other macro or not even a procedure in a macro.

One trick to have closest functionality of this is to copy all the sql queries from macro2 to macro1 and add parameters if it is necessary as shown below.

replace macro1( val int)
(  sel * from employee_table where empid= :val );

replace macro2( dept_no int)
(  sel * from employee_table where deptno= :dept_no );

so, to call a macro2 inside a is not possible. Hence follow this approach

Replace macro1 (val int, dept_no int)
sel * from employee_table where empid= :val;
sel * from employee_table where deptno= :dept_no ;

Sunday, June 26, 2011

How do you find the list of employees named "john" in an employee table without using Like operator??

This question seems tricky.. but yes there is another way by which we can find names/patters without using like operator.
By using "BETWEEN" , we can find the lsit of employees named john...
sel * from employee where name betwee 'J' and 'K';

But at times usage of between is tricky, if there are other employees starting with J, those employees will also be listed by this query.

how do you list all the objects available in given database?

1.)select * from dbc.tables where databasename='<DATABASENAME>';

2.) By running a normal help command on that database as follows. 
help database <DATABASENAME';

Sunday, June 12, 2011

What are different types of Spaces available in Teradata ?

There are 3 types of Spaces available in teradata ,they are

1. Perm space
-This is disk space used for storing user data rows in any tables located on the database.
-Both Users & databases can be given perm space.
-This Space is not pre-allocated , it is used up when the  data rows are stored on disk.

2.Spool Space
-It is a  temporary workspace which is used for processing Rows for given SQL statements.
-Spool space is assigned only to users . -
-Once the SQL processing is complete the spool is freed and given to some other query.
-Unused Perm space is automatically available for Spool . 

3. TEMP space
-It is allocated to any databases/users where Global temporary tables are created and data is stored in them.
-Unused perm space is available for TEMP space

Thursday, June 9, 2011

What is hash collision ?

This occurs when there is same hash value generated for two different Primary Index Values. It is a rare occurance and Has been taken care in future versions of TD.

What is RAID, What are the types of RAID?

Redundant Array of Inexpensive Disks (RAID)  is a type of protection available in Teradata. RAID  provides Data protection at the disk Drive level.  It ensures data is available even when the disk drive had failed.

Th\ere are around 6 levels of RAID ( RAID0 to RAID5) .  Teradata supports Two levels of RAID protection
RAID 1 - Mirrored copy of data in other disk
RAID 5 - Parity bit (XOR) based Data protection on each disk array.

One of the major overhead's of RAID is Space consumption

Sunday, June 5, 2011

How do you find the No of AMP 's in the teradata Database?

1.) You can do a SELECT HASHAMP()+1 to get the total number of Amps in the  given teradata system.
2.) Details about amps can also be checked in Configuration management on  teradata PMON tool.

How do you see a DDL for an existing table in Teradata?

By using show table command  as follows
show table tablename ;

This will display DDL structure of table along with following details
Before/after journal
set/multiset table type
details about column - datatype ,default,identity/primary -foreign key .

How to find duplicates in a table?

To find duplicates in the table , we can use group by function on those columns which are to be used and then listing them if their count is >1 .

Following sample query can be used to find duplicates in table having  3 columns
select col1, col2,col3, count(*) from table
group by col1, col2, col3 
having count  (*) > 1 ;

Saturday, June 4, 2011

what are different types of journals in teradata?

There are  3 different types of journals available in Teradata. They are

1. Transient Journal  - This maintains current transaction history. Once the query is successful it deletes entries from its table .  If the current query transaction fails, It rolls back data from its table.

2. Permanent Journal  - This is defined when a table is created.  It can store BEFORE or AFTER image of tables. DUAL copies can also be stored. Permanent Journal maintains Complete history of table.

3.Down AMP recovery Journal (DARJ)  - This journal activates when the AMP which was supposed to process goes down.  This journal will store all entries for AMP which went down.  Once that AMP is up, DARJ copies all entries to that AMP and makes that AMP is sync with current environment.

What are the reasons for product joins ?

1. Stale or no stats causing optimizer to use product join
2. Improper usage of aliases in the query.
3. missing where clause ( or Cartesian product join  1=1 )
4. non equality conditions like > ,< , between   example ( date)
5. few join conditions
6.  when or conditions are used.

last but not the least   product joins are not bad always!! sometimes PJ are better compared to other types of joins.

Thursday, June 2, 2011

How to rename columns using views?

Create a view which is a subset of the employee table as follows.
Renaming columns in views will help increase security of sensitive tables  and hiding columns under alias names

Replace  view EmployeeV( number,fullname,addr,phno,depnum,sal, expr) as
locking row for access 
EmpNo ,                       
Name  ,                       
Phone   ,                     
DeptNo   ,                    
Salary    ,                   
from employee;

When we run sel * form employeeV, Only alias columns are displayed .

What are Restrictions on Views in Teradata?

An index cannot be Created on a view.
It cannot contain an ORDER BY clause.
All the derived columns and aggregate columns used in the view  must have an AS clause (alias defined).
A view cannot be used to UPDATE if it contains the following :
        – Data from more than one table (JOIN VIEW)
        – The same column twice
        – Derived columns
        – A DISTINCT clause
        – A GROUP BY clause

Sunday, May 29, 2011

how do you you implement Multi valued compression in an existing table?

MVC can be implemented on following conditions
1. A new column with multi valued compression can be added to an existing table, but cannot modify existing compressed column.
2. Create a new table with column which has MVC and do insert .. select from original table
  CREATE TABLE... as with column designated MVC.

Friday, May 27, 2011

list Built-in functions used in teradata ?

The main functionality of built in functions is  that they dont need any arguments or paramaters and can be used directly with select to return system values.
Some of them are

• SESSION: – Returns a number for the session for current user .
• TIME: – this function provides the current time based on a 24-hour day
• USER: –  This one gives the user name of the current user.
• ACCOUNT: – display's your Teradata Account information
• CURRENT_DATE: – Returns the current system date
• CURRENT_TIME: - This function returns the current system time and current session ‘Time Zone’ displacement.
• CURRENT_TIMESTAMP: - Returns the current system timestamp  with TimeZone
• DATABASE: –  It  returns the name of the default database for the current user.
• DATE: – same as Current_DATE   and is teradata built in .

Thursday, May 26, 2011

How do you whether table is locked or not?

Just run the following query on the table.

Lock Table DBNAME.TABLENAME write nowait

If this table is locked currently then , then the above statement would fail as Nowait will not wait for locks to be released on that  table .

Wednesday, May 18, 2011

What are advantages of compression on tables?

- They take less physical space then uncompressed columns hence reducing space cost
- They improve  system performance as less data will be retrieved per row fetched , more data is fetched per data block thus increasing data loading speed
- They reduce overall I/O

Sunday, February 20, 2011

How many error tables are there in fload and Mload and what is their significance/use?

How many error tables are there in fload and Mload and what is their significance/use?


Fload uses 2 error tables
ET TABLE 1: where format of data is not correct.
ET TABLE 2: violations of UPI
It maintains only error field name, error code and data parcel only.

Mload also uses 2 error tables (ET and UV), 1 work table and 1 log table
1. ET TABLE - Data error
MultiLoad uses the ET table, also called the Acquisition Phase error table, to store data errors found during the acquisition phase of a MultiLoad import task.

2. UV TABLE - UPI violations
MultiLoad uses the UV table, also called the Application Phase error table, to store data errors found during the application phase of a MultiLoad import or delete task

Apart from error tables, it also has work and log tables
Mload loads the selected records in the work table

A log table maintains record of all checkpoints related to the load job, it is essential/mandatory to specify a log table in mload job. This table will be useful in case you have a job abort or restart due to any reason.

How do you find out number of AMP's in the Given system

How do you find out number of AMP's in the Given system


1.running following query in queryman
Select HASHAMP () +1;

2. We can find out complete configuration details of nodes and amps in configuration screen of Performance monitor

Tuesday, February 15, 2011

What are the difference types of temporary tables in Teradata?

What are the difference types of temporary tables in Teradata?

a.       Global temporary tables
b.       Volatile temporary tables
c.       Derived tables

Global Temporary tables (GTT) –
1. When they are created, its definition goes into Data Dictionary.
2. When materialized data goes in temp space.
3. That's why, data is active up to the session ends, and definition will remain there up-to its not dropped using Drop table statement. If dropped from some other session then its should be Drop table all;
4. You can collect stats on GTT.

Volatile Temporary tables (VTT) -
1. Local to a session (deleted automatically when the session terminates)
2. Table Definition is stored in System cache .A permanent table definition is stored in the DBC data dictionary database (DBC.Temptables) .
3. Data is stored in spool space.
4. That’s why; data and table definition both are active only up to session ends.
5. No collect stats for VTT.If you are using volatile table, you can not put the default values on column level (while creating table)
6. Created by the CREATE VOLATILE TABLE sql statement

Derived tables
1 Derived tables are local to an SQL query.
2 Not included in the DBC data dictionary database, the definition is kept in cache.
3 They are specified on a query level with an AS keyword in an sql statement

List types of HASH functions used in Teradata?

List types of HASH functions used in Teradata?


The SQL hash functions are:

    * HASHROW (column(s))
    * HASHBUCKET (hashrow)
    * HASHAMP (hashbucket)
    * HASHBAKAMP (hashbucket)

            HASHROW ('Teradata')   AS "Hash Value"
            , HASHBUCKET (HASHROW ('Teradata')) AS "Bucket Num"
            , HASHAMP (HASHBUCKET (HASHROW ('Teradata'))) AS "AMP Num"
            , HASHBAKAMP (HASHBUCKET (HASHROW ('Teradata')))  AS "AMP Fallback Num" ;

This is really good, by looking into the result set of above written query you can easily find out the Data Distribution across all AMPs in your system and further you can easily identify un-even data distribution

What are the advantages and dis-advantages of secondary Indexes?

What are the advantages and dis-advantages of secondary Indexes?

1. A secondary index might be created and dropped dynamically
2.A table may have up to 32 secondary indexes.
3. Secondary index can be created on any column. .Either Unique or Non-Unique
4. It is used as alternate path or Least frequently used cases.  ex. defining SI on non indexed column can improve the performance, if it is used in  join or filter condition of a given query.
5. Collecting Statistics on SI columns make sure Optimizer choses SI if it is better than doing Full Table Scans

1. Since Sub tables are to be created, there is always an overhead for additional spaces.
2. They require additional I/Os to maintain their sub tables.
3. The Optimizer may, or may not, use a NUSI, depending on its selectivity.
4. If the base table is Fallback, the secondary index sub table is Fallback as well.
5. If statistics are not collected accordingly, then the optimizer would go for Full Table Scan.

Where does TD store transient journal?

Where does TD store transient journal?

In perm space -> dbc.transientjournal
But that special table can grow over dbc's perm limit until the whole system runs out of perm space.

How to select first N Records in Teradata?

How to select first N Records in Teradata?

To select N records in Teradata you can use RANK function. Query syntax would be as follows

How to view every column and the columns contained in indexes in Teradata?

How to view every column and the columns contained in indexes in Teradata?

Following query describes each column in the Teradata RDBMS

Following query describes columns contained in indexes in the Teradata RDBMS

How Teradata makes sure that there are no duplicate rows being inserted when it’s a SET table?

How Teradata makes sure that there are no duplicate rows being inserted when it’s a SET table?

Teradata will redirect the new inserted row as per its PI to the target AMP (on the basis of its row hash value), and if it find same row hash value in that AMP (hash synonyms) then it start comparing the whole row, and find out if duplicate.
If it’s a duplicate it silently skips it without throwing any error.

What is a clique?

What is a clique?

A clique is a set of Teradata nodes that share a common set of disk arrays which are connected in daisy chain network to each disk array controller.
Cliques provide data accessibility if a node fails for any reason,  Proms are distributed across all nodes in the system. Large multiple node systems will have clique mechanisms associated with them

What is the difference between MultiLoad & Fastload interns of Performance?

What is the difference between MultiLoad & Fastload interns of Performance?

If you want to load, empty table then you use the fastload, so it will very useful than the MultiLoad ,because fastload performs the loading of the data in 2phase and its no need a work table for loading the data .
So it is faster as well as it follows the below steps to load the data in the table
Phase1 - It moves all the records to the entire AMP first without any hashing
Phase2 - After giving end loading command, Amp will hashes the record and send it to the appropriate AMPS.

It does the loading in the 5 phases
Phase1 - It will get the import file and checks the script
Phase2 - It reads the record from the base table and store in the work table
Phase3 - In this acquisition phase it locks the table header
Phase4 - In the DML operation will done in the tables
Phase5 - In this table locks will be released and work tables will be dropped.

How does indexing improve query performance?

 How does indexing improve query performance?

Indexing is a way to physically reorganize the records to enable some frequently used queries to run faster.
The index can be used as a pointer to the large table. It helps to locate the required row quickly and then return it back to the user.
The frequently used queries need not hit a large table for data. They can get what they want from the index itself. - cover queries.

Index comes with the overhead of maintenance. Teradata maintains its index by itself. Each time an insert/update/delete is done on the table the indexes will also need to be updated and maintained.
Indexes cannot be accessed directly by users. Only the optimizer has access to the index.

What is error table? What is the use of error table?

What is error table? What is the use of error table?

The Error Table contains information concerning:

- Data conversion errors Constraint violations and other error conditions:

* Contains rows which failed to be manipulated due to constraint violations or Translation error
* Captures rows that contain duplicate Values for UPIs.
* It logs errors & exceptions that occurs during the apply phase.
* It logs errors that are occurs during the acquisition phase.

How to find out list of indexes in Teradata?

How to find out list of indexes in Teradata?

Nonpartitioned Primary
Partitioned Primary
join index
hash index
primary key
unique constraint
value ordered secondary
hash ordered ALL covering secondary
valued ordered ALL covering secondary
ordering column of a composite secondary index
Multi column statistics
Derived column partition statistics
field1 column of a join or hash index
field2 column of a join or hash index

databasename, tablename, columnname, indextype, indexnumber, indexname
databasename,  tablename, indexnumber;

When should the statistics be collected?

When should the statistics be collected?

Here are some excellent guidelines on when to collect statistics:
·       All Non-Unique indices
·       Non-index join columns
·       The Primary Index of small tables
·       Primary Index of a Join Index
·       Secondary Indices defined on any join index
·       Join index columns that frequently appear on any additional join index columns that frequently appear in WHERE search conditions
·       Columns that frequently appear in WHERE search conditions or in the WHERE clause of joins.

How to make sure Optimizer chooses NUSI over Full Table Scan (FTS)?

How to make sure Optimizer chooses NUSI over Full Table Scan (FTS)?

A optimizer would prefer FTS over NUSI, when there are no Statistics defined on NUSI columns.
It would prefer scanning for full table instead of going for Subtables in NUSI  since optimized does not have information about subtables of NUSI
It is always suggested to collect statistics whenever NUSI columns are defined on the table.
Verify whether index is being used by checking in Explain plan.

What are the basic rules that define how PI is defined in Teradata?

What are the basic rules that define how PI is defined in Teradata?

The following rules govern how Primary Indexes in a Teradata Database must be defined as well as how they function:

One Primary Index per table.
A Primary Index value can be unique or non-unique.
The Primary Index value can be NULL.
The Primary Index value can be modified.
The Primary Index of a populated table cannot be modified.
A Primary Index has a limit of 64 columns. 

What are the basic criteria to select Primary Index column for a given table?

What are the basic criteria to select Primary Index column for a given table?

A thumb rule of ADV demographics is followed.
Access Demographics
Identify index candidates that maximize one-AMP operations.
Columns most frequently used for access (Value and Join).

Distribution Demographics
Identify index candidates that optimize parallel processing.
Columns that provide good distribution.

Volatility Demographics
Identify index candidates with low maintenance I/O.

Saturday, February 12, 2011

What is explain in teradata?

What is explain in teradata?

The EXPLAIN facility is a teradata extension that provides you with an "ENGLISH" translation of the
steps chosen by the optimizer to execute an SQL statement. It may be used on any valid teradata database with a preface called "EXPLAIN".

The following is an example:-
EXPLAIN select last_name first_name FROM employees;

  1. The EXPLAIN parses the SQL statement but does not execute it.
  2. This provides the designer with an "execution strategy".
  3. The execution strategy provides what an optimizer does but not why it chooses them.
  4. The EXPLAIN facility is used to analyze all joins and complex queries.

What are the different return codes(severity errors) in Teradata utilities?

What are the different return codes(severity errors)  in Teradata utilities?

There are 3 basic return codes (severity errors) in teradata utilities.

0 - success
4 - Warning
8 - User error
12 - System error
16 - system error  

Please note that apart from this there are separate error codes for each of the error  returned from sql queries.

For more details on error codes and fixing them please refer  to General reference manual  available from teradata documentation section in the teradata site.

what are the different date formats available in Teradata system?

what are the  different date formats available in Teradata system?

There are two different date formats available ,  they are
 The Teradata default format is:

The ANSI display format is:

Sunday, January 30, 2011

How do you set default date setting in BTEQ?

How do you set default date setting in BTEQ?

There are two default date setting in BTEQ.  They have to be set after logging on to the session

set session dateform = ANSIDATE ;  /*format is yyyy-mm-dd */                
set session dateform = integerdate ; /* format is yy/mm/dd   -teradata date format */

What does DROP table command do?

What does DROP table table_name command do?

  • It deletes all data in table_name
  • Removes the definition from the data dictionary
  • Removes all explicit access rights on the table

Is Like comparison case-sensitive in Teradata?

Is Like comparison case-sensitive in Teradata?

LIKE operator is not case sensitive in Teradata session mode.
Consider the following example  
Select F_name from employee where F_name like '%JO%’;
The following query will pick values matching with 'JO' and 'jo' as well, since Teradata is not case-sensitive

To overcome this problem, a new function called "CASESPECIFIC" is used in TERADATA as follows
Select F_name from employee where F_name (CASESPECIFIC) like '%JO%’;

How do you submit bteq script (batch mode)?

How do you submit bteq script (batch mode)?

1. Start the BTEQ , by typing BTEQ
2. Enter the following command
            .run file = BTEQScript.btq

1. Bteq < BTEQScript.btq
BTEQScript.btq contains following
 .logon, dbc;
sel top 10 * from dbc.tables;

What is the command in BTEQ to check for session settings ?

What is the command in BTEQ to check for session settings ?

The BTEQ .SHOW CONTROL command displays BTEQ settings.

The .SHOW CONTROL command shows all BTEQ session parameters
 [SET] FOLDLINE          = OFF ALL                                              
 [SET] FOOTING           = NULL                                                 
 [SET] FORMAT            = OFF                                                  
 [SET] FORMCHAR          = OFF                                                  
 [SET] HEADING           = NULL                                                 
 [SET] INDICDATA         = OFF                                                  
 [SET] NOTIFY            = OFF                                                  
 [SET] NULL              = ?                                                    
 [SET] OMIT              = OFF ALL                                              
 [SET] PAGEBREAK         = OFF ALL                                              
 [SET] PAGELENGTH        = 55                                                   
 [SET] QUIET             = OFF                                                  
 [SET] RECORDMODE        = OFF                                                  
 [SET] RETCANCEL         = OFF                                                  
 [SET] RETLIMIT          = No Limit                                             
 [SET] RETRY             = ON                                                   
 [SET] RTITLE            = NULL                                                 
 [SET] SECURITY          = NONE                                                 
 [SET] SEPARATOR         = two blanks                                           
 [SET] SESSION CHARSET   = ASCII                                                
 [SET] SESSION SQLFLAG   = NONE                                                 
 [SET] SESSION TRANSACTION = BTET                                               
 [SET] SESSIONS          = 1                                                    
 [SET] SIDETITLES        = OFF for the normal report. 
 [SET] SKIPDOUBLE        = OFF ALL                                             
 [SET] SKIPLINE          = OFF ALL                                             
 [SET] SUPPRESS          = OFF ALL                                             
 [SET] TDP               = l5442                                               
 [SET] TITLEDASHES       = ON for the normal report.                           
       And, it is   ON for results of WITH clause number: 1 2 3 4 5 6 7 8 9.   
 [SET] UNDERLINE         = OFF ALL                                             
 [SET] WIDTH             = 75