Sunday, January 30, 2011

How do you set the default date setting in BTEQ?
Answer

There are two DATEFORM settings in BTEQ. They must be set after logging on to the session:

set session dateform = ANSIDATE ;    /* format is yyyy-mm-dd */
set session dateform = INTEGERDATE ; /* format is yy/mm/dd - the Teradata date format */

What does the DROP TABLE table_name command do?

  • It deletes all data in table_name
  • Removes the definition from the data dictionary
  • Removes all explicit access rights on the table
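As a quick illustration (the database and table names are hypothetical):

```sql
-- Removes the data, the dictionary definition,
-- and all explicit access rights in one statement
DROP TABLE payroll.employee_stg;
```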

Is Like comparison case-sensitive in Teradata?


Answer

The LIKE operator is not case sensitive in Teradata session mode.
Consider the following example:
Select F_name from employee where F_name like '%JO%';
This query will pick values matching 'JO' as well as 'jo', since the comparison is not case specific.

To make the comparison case sensitive, the CASESPECIFIC attribute is used as follows:
Select F_name from employee where F_name (CASESPECIFIC) like '%JO%';

How do you submit bteq script (batch mode)?



Answer
1. Start BTEQ by typing bteq at the command prompt.
2. Enter the following command:
            .run file = BTEQScript.btq

OR
1. bteq < BTEQScript.btq
BTEQScript.btq contains the following:
 .logon 127.0.0.1/dbc, dbc;
sel top 10 * from dbc.tables;
.quit

What is the command in BTEQ to check for session settings ?


Answer
The BTEQ .SHOW CONTROL command displays all BTEQ session settings.
Example output:
 [SET] FOLDLINE          = OFF ALL                                              
 [SET] FOOTING           = NULL                                                 
 [SET] FORMAT            = OFF                                                  
 [SET] FORMCHAR          = OFF                                                  
 [SET] HEADING           = NULL                                                 
 [SET] INDICDATA         = OFF                                                  
 [SET] NOTIFY            = OFF                                                  
 [SET] NULL              = ?                                                    
 [SET] OMIT              = OFF ALL                                              
 [SET] PAGEBREAK         = OFF ALL                                              
 [SET] PAGELENGTH        = 55                                                   
 [SET] QUIET             = OFF                                                  
 [SET] RECORDMODE        = OFF                                                  
 [SET] RETCANCEL         = OFF                                                  
 [SET] RETLIMIT          = No Limit                                             
 [SET] RETRY             = ON                                                   
 [SET] RTITLE            = NULL                                                 
 [SET] SECURITY          = NONE                                                 
 [SET] SEPARATOR         = two blanks                                           
 [SET] SESSION CHARSET   = ASCII                                                
 [SET] SESSION SQLFLAG   = NONE                                                 
 [SET] SESSION TRANSACTION = BTET                                               
 [SET] SESSIONS          = 1                                                    
 [SET] SIDETITLES        = OFF for the normal report. 
 [SET] SKIPDOUBLE        = OFF ALL                                             
 [SET] SKIPLINE          = OFF ALL                                             
 [SET] SUPPRESS          = OFF ALL                                             
 [SET] TDP               = l5442                                               
 [SET] TITLEDASHES       = ON for the normal report.                           
       And, it is   ON for results of WITH clause number: 1 2 3 4 5 6 7 8 9.   
 [SET] UNDERLINE         = OFF ALL                                             
 [SET] WIDTH             = 75

What are the benefits of the Permanent Journal?

The benefits of the Permanent Journal are:
  • Permits capture of before images for database rollback.
  • Permits capture of after images for database roll-forward.
  • Permits archiving of change images during table maintenance.
  • Reduces the need for full-table backups.
  • Provides a means of recovering NO FALLBACK tables.
Its costs are:
  • Requires additional disk space for the change images.
  • Requires user intervention for archive and recovery activity.
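A rough sketch of how a permanent journal is set up (the database, size, and table names here are illustrative, not from the original answer):

```sql
-- The journal table is owned by a database that names it as its default
CREATE DATABASE sales_db AS
    PERM = 100e6,
    DEFAULT JOURNAL TABLE = sales_db.sales_jnl;

-- A table capturing both before images (rollback) and after images (roll-forward)
CREATE TABLE sales_db.orders,
    BEFORE JOURNAL,
    AFTER JOURNAL
(
    ord_number  INTEGER NOT NULL,
    order_total INTEGER
)
PRIMARY INDEX (ord_number);
```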

What are the benefits of fallback?



The benefits of fallback are
  • Protects your data from hardware (disk) failure.
  • Protects your data from software (node) failure.
  • Automatically recovers with minimum recovery time, after repairs or fixes are complete
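A minimal sketch of enabling fallback at table creation (names are illustrative):

```sql
-- FALLBACK keeps a second copy of every row on a different AMP
-- in the same cluster, so the table survives an AMP going down
CREATE TABLE employee,
    FALLBACK
(
    emp_id   INTEGER NOT NULL,
    emp_name VARCHAR(30)
)
PRIMARY INDEX (emp_id);
```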

What’s the difference between TIMESTAMP (0) and TIMESTAMP (6)?



TIMESTAMP(0) occupies 19 characters and TIMESTAMP(6) occupies 26 characters:
Timestamp(0) is YYYY-MM-DDbHH:MI:SS
Timestamp(6) is YYYY-MM-DDbHH:MI:SS.ssssss (six extra digits of fractional seconds, i.e. microseconds)
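A quick way to see the difference side by side:

```sql
-- Compare the two precisions in one row
SELECT CURRENT_TIMESTAMP(0),   -- e.g. 2011-01-30 10:15:42
       CURRENT_TIMESTAMP(6);   -- e.g. 2011-01-30 10:15:42.123456
```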

What is a Dirty-Read or Stale-Read Lock?



A dirty read occurs when an ACCESS lock is applied to a table that is being updated.
If performed during table maintenance it may produce erroneous results: the read can return rows in an intermediate state (a dirty or stale read), leading to an inconsistent result set.
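A sketch of the locking modifier that produces such reads (table and column names are illustrative):

```sql
-- The ACCESS lock lets this SELECT run even while an update holds a
-- WRITE lock, so the rows returned may be in an inconsistent state
LOCKING TABLE employee FOR ACCESS
SELECT emp_id, salary
FROM employee;
```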

What are the differences between BTEQ and SQL Assistant (Queryman)?

BTEQ
  • Basic Teradata Query utility
  • SQL front-end
  • Report writing and formatting features
  • Interactive and batch queries
  • Import/Export across all platforms
  • The default number of sessions, upon login, is 1.
 
Teradata Query Manager / Queryman / TeradataSQL Assistant
  • SQL front-end for ODBC compliant databases
  • Historical record of queries including:
-         Timings
-         Status
-         Row counts
  • Random sampling feature
  • Limit amount of data returned
  • Import/Export between database and PC
  • Export to EXCEL or ACCESS

Sunday, January 16, 2011

How do you execute the given SQL statement repeatedly in BTEQ?

Answer:

Select top 1 * from database.table1;
=n
Here "=n" re-runs the previous SQL statement n times.

What are types of PARTITION PRIMARY INDEX (PPI) in Teradata?

Answer:


1. Partition by CASE
CREATE TABLE ORDER_Table
(
ORD_number  integer NOT NULL,
customer_number integer NOT NULL,
order_date  date ,
order_total integer
)
PRIMARY INDEX (customer_number)
PARTITION BY CASE_N (
            order_total < 10000 ,
            order_total < 20000 ,
            order_total < 30000 ,
            NO CASE OR UNKNOWN ) ;

2. Partition by Range - example using a date range
CREATE TABLE ORDER_Table
(
ORD_number  integer NOT NULL,
customer_number integer NOT NULL,
order_date  date ,
order_total integer
)
PRIMARY INDEX (customer_number)
PARTITION BY RANGE_N (
            order_date BETWEEN date '2010-01-01' AND date '2010-12-01'
            EACH interval '1' month ,
            NO RANGE OR UNKNOWN ) ;

P.S.: If we use NO RANGE or NO CASE, all values outside the defined ranges/cases go into a single partition.
If we specify UNKNOWN, all NULL values are placed in that partition.

Can we define PARTITION PRIMARY INDEX (PPI) on a Primary Index column in Table? Explain Why?

Answer:

PPI cannot be defined on the PI column alone. Since the PI is used to locate an AMP and store the data based on its hash value (ROW ID), it cannot by itself serve as the PPI column.
In the case of PPI, the data stored on the AMPs is partitioned on the PPI column after the rows are hashed (ROW KEY = ROW ID + PPI value).

P.S.: If you want to create a UPI on the table, the PPI column can be added as part of the PI.

How to skip 1st record while using Bteq IMPORT?

Answer:

By using skip=1 in the .IMPORT command, we can skip the first record:
 .import data file=<filename>, skip=1;

What is TENACITY? What is its default value?

Answer

TENACITY specifies the number of hours the utility keeps retrying to obtain a loader slot or to establish all requested sessions at logon. The default for FastLoad is no tenacity, meaning it will not retry at all. If several FastLoad jobs are executed at the same time, we recommend setting TENACITY to 4, meaning the system will keep trying to log on the requested number of sessions for up to four hours.

What does the SLEEP command do in FastLoad?
Answer

The SLEEP command specifies the number of minutes to wait before retrying the logon to establish all sessions.
The SLEEP command can be used with all load utilities, not only FastLoad.
Retrying becomes necessary if all of the loader slots are in use or if the requested number of sessions is not available. The default value is 6 minutes. If TENACITY is set to 2 hours and SLEEP to 10 minutes, the utility will retry the logon every 10 minutes for up to 2 hours.
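A sketch of how TENACITY and SLEEP might appear together at the top of a FastLoad script (the values and logon details are illustrative):

```sql
/* FastLoad script fragment - values are illustrative */
SESSIONS 4;
TENACITY 4;      /* keep retrying the logon for up to 4 hours   */
SLEEP 10;        /* wait 10 minutes between logon attempts      */
LOGON tdpid/username,password;
```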

What is Cross Join?

Answer:

It is a join in which every row of one table is joined with every row of the other, equivalent to a product join.
There is no "ON" clause in a CROSS JOIN; a WHERE clause may still filter the result:
SELECT  EMP.ename , DPT.Dname
FROM     employee EMP
CROSS JOIN
Department DPT
WHERE
EMP.deptno = DPT.deptno ;

How many types of Index are present in teradata?

How many types of Index are present in teradata?
Answer:

There are 5 different types of index in Teradata:
1. Primary Index
      a. Unique primary index
      b. Non-unique primary index
2. Secondary Index
      a. Unique secondary index
      b. Non-unique secondary index
3. Partitioned Primary Index
      a. Case partition (ex. age, salary...)
      b. Range partition (ex. date)
4. Join index
      a. Single-table join index
      b. Multi-table join index
      c. Sparse join index (a WHERE-clause constraint applied to a join index)
5. Hash index

Difference between Stored Procedure and Macro?

Answer:

Stored Procedure:
  • It does not return rows to the user.
  • It has to use cursors to fetch multiple rows.
  • It uses INOUT/OUT parameters to send values to the user.
  • It contains comprehensive SPL.
  • It is stored in DATABASE or USER PERM space.
  • It provides both input and output capabilities.

Macro:
  • It returns a set of rows to the user.
  • It executes in DBC SPOOL space.
  • It allows only input values.

What is a Sparse Index?

Answer:

Sparse Join Indexes are a type of Join Index which contains a WHERE clause that reduces the number of rows which would otherwise be included in the index. All types of join indexes, including single table, multitable, simple or aggregate can be sparse.
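A minimal sketch of a sparse join index, reusing the ORDER_Table example from the PPI question (the index name and date filter are illustrative):

```sql
-- The WHERE clause makes the index "sparse": only 2010 orders are indexed
CREATE JOIN INDEX orders_2010_jix AS
SELECT ord_number, customer_number, order_total
FROM ORDER_Table
WHERE order_date >= DATE '2010-01-01'
PRIMARY INDEX (customer_number);
```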

How do you handle NULLs in Teradata? How many columns can a table have?
Answer

1. Use zeroifnull, nullifzero in select and NULL in insert directly.
2. 256 columns max per table.
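A small illustration of the two functions (table and column names are hypothetical):

```sql
-- NULLIFZERO turns 0 into NULL; ZEROIFNULL turns NULL into 0
SELECT ZEROIFNULL(bonus),        -- a NULL bonus is reported as 0
       sal / NULLIFZERO(cnt)     -- divide-by-zero avoided: result is NULL instead
FROM   salary_table;
```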

How do you find the average salary without using the AVG function?
Answer

Without using AVG, we can find the average salary by using SUM(sal)/COUNT(sal):
sel sum(sal)/count(sal) as avgsal from tablename;

What is the difference between a User and a Database in Teradata?
Answer:

- A user is a database with a password; a database cannot have a password.
- Both can contain tables, views and macros.
- Both users and databases may or may not hold privileges.
- Only users can log in, establish a session with the Teradata database, and submit requests.

How do you create materialized view in Teradata?

Answer:

There is no such thing as a "materialized view" in Teradata. The equivalent in Teradata would be a join index (or aggregate index) with a corresponding view put on top of it. The command to create one is "CREATE JOIN INDEX...(lots of options)".

Join indices are maintained when the data is maintained. They cannot be accessed directly, but are invoked when the optimizer determines it is beneficial. Even though you cannot invoke a join index directly, you create a view that looks like the join index and in that way, it would be similar to having direct access. However, this approach does not guarantee that the join index will be used.
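A sketch of the pattern described above, reusing the ORDER_Table example (the index and view names are illustrative):

```sql
-- Aggregate join index: maintained automatically as the base table changes
CREATE JOIN INDEX sales_by_cust_jix AS
SELECT customer_number, SUM(order_total) AS tot, COUNT(*) AS cnt
FROM ORDER_Table
GROUP BY customer_number;

-- A view that "looks like" the join index; the optimizer may
-- (but is not guaranteed to) answer it from the index
CREATE VIEW sales_by_cust AS
SELECT customer_number, SUM(order_total) AS tot, COUNT(*) AS cnt
FROM ORDER_Table
GROUP BY customer_number;
```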

What are Differences between Teradata and ANSI Session modes in Teradata?

Answer:

TERADATA MODE
-------------
1. Comparison is not case sensitive
2. CREATE TABLE defaults to SET tables
3. Each transaction is committed implicitly
4. Supports all Teradata commands
5. It follows BTET (Begin and End Transaction) mode


ANSI MODE
-------------
1. Comparison is case sensitive
2. CREATE TABLE defaults to MULTISET tables
3. Each transaction has to be committed explicitly
4. Does not support all Teradata commands
5. It does not follow BTET mode

What are the scenarios in which Full Table Scans occurs?


1. The WHERE clause of the SELECT statement uses neither the primary index nor a secondary index.
2. The SQL statement uses a partial value (LIKE or NOT LIKE) in the WHERE clause.
3. The SQL statement contains no WHERE clause.
4. The SQL statement uses a range in the WHERE clause, e.g. (col1 > 40 or col1 <= 10000).

How to identify PPI columns?

Answer:

Select databasename , tablename , columnposition ,columnname from dbc.indices
where indextype ='Q'
order by 1 ,2,3 ;

How do you skip the header row in a FastLoad script?

RECORD 2;   /* this skips first record in the source file */
DEFINE ...

Explain types of re-distribution of data happening for joining of columns from two tables

Answer:

Case 1 - P.I = P.I joins
Case 2 - P.I = N.U.P.I joins
Case 3 - N.U.P.I = N.U.P.I joins

Case 1 - No redistribution of data across AMPs. The joins are AMP-local, since matching rows already reside on the same AMP and need not be redistributed. These joins on unique primary indexes are very fast.
Case 2 - Data from the second table is redistributed across all AMPs, since the join is on a PI column versus a NUPI column. The ideal scenario is that the smaller table is redistributed to be joined with the large table's rows on the same AMP.
Case 3 - Data from both tables is redistributed across all AMPs. This is one of the slowest cases, so care should be taken that statistics are collected on these columns.

Can you load multiple data files for same target table using Fastload?

Answer:

Yes, we can load a table using multiple data files in FastLoad.

Before issuing the END LOADING statement, the user can define a file path and an INSERT statement for each source file, and issue END LOADING only once at the end.

Why does varchar occupy 2 extra bytes?

Answer

The two extra bytes hold the binary length of the field, i.e. the exact number of characters stored in the VARCHAR.

How many types of Skew exist?

Answer

Skew exists when Teradata resources (CPU, AMP, IO, disk, etc.) are utilized unequally. There are 3 major types of skew: CPU skew, AMP/data skew and IO skew.

- Data skew: when data is not distributed equally across all the AMPs.
- CPU skew: when some AMPs consume more CPU than others.
- IO skew: when some AMPs perform more IO operations than others.

Why doesn’t FastLoad support duplicate rows in MULTISET tables?
Answer

FastLoad cannot load duplicate rows for the following reason.

Say the FastLoad job fails. Up to the point of failure, some number of rows has already been sent to the AMPs.
If you restart FastLoad, it resumes loading from the last checkpoint, so some of those rows are sent a second time. These are caught as duplicate rows when the data is sorted.
This restart logic is why FastLoad will not load duplicate rows even into a MULTISET table: it must assume they are duplicates. FastLoad can load data into a MULTISET table (a table that allows duplicate rows), but it discards any duplicate rows it finds while loading.

Can we collect statistics on derived tables, volatile tables and global temporary tables? What is a Global Temporary table, and what is it used for?

Answer
1. No for volatile and derived tables; yes for global temporary tables.
2. Global temporary tables are temp tables like volatile tables, but unlike volatile tables their definitions are retained in the Data Dictionary.
3. They are used whenever there is a need for a temporary table with the same definition for all users.
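A minimal sketch of creating a global temporary table (names are illustrative):

```sql
-- The definition persists in the Data Dictionary;
-- the rows live in temp space only for the session
CREATE GLOBAL TEMPORARY TABLE gt_orders
(
    ord_number  INTEGER,
    order_total INTEGER
)
PRIMARY INDEX (ord_number)
ON COMMIT PRESERVE ROWS;
```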

What is the default join strategy in Teradata?
Answer
# 1
The Teradata cost-based optimizer decides the join strategy based on the optimum path. The common strategies are Merge, Hash and Nested joins.

Answer
# 2
Three row-distribution strategies followed by the optimizer are:
1. Duplicate the rows of one table to every AMP.
--> Chosen when the non-PI join column belongs to a small table.
2. Redistribute the non-PI rows of one table to the AMPs containing the matching PI rows.
--> Chosen when the tables are large. The rows joined on the non-PI column of one table are redistributed to the AMPs holding the matching PI rows.
3. Redistribute the rows of both tables by hash value.
--> Chosen when the join is on a column that is not the PI of either table. This requires the most spool space.

What do High confidence, Low confidence and No confidence mean in EXPLAIN plan?

Answer

EXPLAIN gives the execution strategy, i.e. the different steps the query will go through.
HIGH CONFIDENCE: statistics are collected on the column(s).
LOW CONFIDENCE: statistics are not collected, but the WHERE condition is on an indexed column, so estimates can be based on sampling.
NO CONFIDENCE: statistics are not collected and the condition is on a non-indexed column.

How to Skip or Get first and Last Record from Flat File through MultiLoad?

Answer

The .IMPORT command in MultiLoad has an option to give the record number from which processing should begin: FROM m, where m is the logical record number, as an integer, of the record in the identified data source where processing is to begin. You can specify m as 2 and processing will start from the second record, skipping the first.

THRU k and FOR n are options of the same command that bound where processing ends.

If FROM (start record) and FOR/THRU (stop record) are not mentioned, MultiLoad processes records from the start to the end of the file.

Which is faster – MultiLoad delete or Delete command?

Answer

MultiLoad delete is faster than the normal DELETE command, since the deletion happens on whole data blocks (64 KB), whereas the DELETE command deletes data row by row. Also, the transient journal maintains entries only for the DELETE command, since the Teradata load utilities do not use the transient journal.

What is Teradata Virtual storage?

Answer

This concept was introduced in TD12. It does the following tasks:
- maintains information on frequency of data access
- tracks data storage on the physical media
- migrates frequently used data to fast disks and less frequently used data to slower disks
- allocates cylinders from storage to individual AMPs

How do you start/stop the database on Windows?
Answer

1. Log on to CMD.
2. Check the state: pdestate -d
3. Run the following command to start the database: net start recond
4. Check the state again: pdestate -d
5. To STOP the database (the Trusted Parallel Application, or TPA):
tpareset -x comment
The -x option stops Teradata without stopping the OS.

What is a role?

Answer

A role is a set of access rights which can be assigned to users. Roles indirectly help performance by reducing the number of rows entered in DBC.AccessRights.
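A sketch of how a role is typically created and granted (names are illustrative):

```sql
-- One grant to the role replaces many per-user rows in DBC.AccessRights
CREATE ROLE report_role;
GRANT SELECT ON sales_db TO report_role;

GRANT report_role TO some_user;
MODIFY USER some_user AS DEFAULT ROLE = report_role;
```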

What is a profile?
Answer

A profile contains a set of user parameters, such as accounts, default database, spool
space, and temporary space, for a group of users.

To assign the profile to a user, use the AS PROFILE modifier in a CREATE USER or
MODIFY USER statement:
MODIFY USER username AS PROFILE=profilename ;
To remove a profile from a member but retain the profile itself:
MODIFY USER username AS PROFILE=NULL ;
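For completeness, a sketch of creating such a profile in the first place (the names and sizes are illustrative):

```sql
-- A profile bundles account, default database and space settings for a group
CREATE PROFILE analyst_profile AS
    ACCOUNT = '$M_analyst',
    DEFAULT DATABASE = sales_db,
    SPOOL = 500e6,
    TEMPORARY = 100e6;

MODIFY USER some_user AS PROFILE = analyst_profile;
```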

Why are AMPs and PEs called vprocs?
Answer

AMPs and PEs are implemented as “virtual processors - vprocs”.
They run under the control of PDE and their number is software configurable.
AMPs are associated with “virtual disks – vdisks” which are associated with logical units (LUNs) within a disk array

Answer 2:
AMPs and PEs are virtual processes. Messages pass from the PEs to the AMPs over the BYNET via the message passing layer. Each AMP manages its own virtual disk (vdisk), which maps onto the physical disks of a disk array, and the vdisks handle the movement of data to and from the physical disks. Because AMPs and PEs are software processes rather than physical processors, they are called virtual processors (vprocs).

What is a residual condition in an EXPLAIN plan?
Answer

A residual condition is a WHERE-clause condition applied to reduce the number of rows, over and above the conditions used for the join itself. A residual condition does not help in locating a row (it is not used for index access); it filters rows after they are read.

How do you check whether a given object is a database or a user?
Answer

To check whether a given object is a user or a database, we can use the following query ('U' = user, 'D' = database):
sel * from dbc.databases where dbkind='U' or dbkind='D';

Saturday, January 15, 2011

What are SET and MULTISET tables in Teradata? Explain with an appropriate example.
Answers:
1) SET tables do not accept duplicates at the row level (not just at the index or key level).
Example rows for a SET table:
R1 c2 c3 c4 ... cn
1  2  3  4 ... 9   <- accepted
1  2  3  4 ... 9   <- duplicate row is rejected
2  1  2  4 ... 9   <- accepted
3  2  4  4 ... 9   <- accepted

2) MULTISET tables accept duplicates at the row level.
Example rows for a MULTISET table:
R1 c2 c3 c4 ... cn
1  2  3  4 ... 9
1  2  3  4 ... 9   <- duplicate row is accepted
2  1  2  4 ... 9
3  2  4  4 ... 9
3  2  4  4 ... 9   <- duplicate row is accepted
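The behaviour above can be sketched in DDL/DML (table names are illustrative):

```sql
-- A SET table rejects an exact duplicate row; a MULTISET table keeps it
CREATE SET TABLE t_set (c1 INTEGER, c2 INTEGER) PRIMARY INDEX (c1);
CREATE MULTISET TABLE t_multi (c1 INTEGER, c2 INTEGER) PRIMARY INDEX (c1);

INSERT INTO t_set   VALUES (1, 2);
INSERT INTO t_set   VALUES (1, 2);  -- fails: duplicate row in a SET table
INSERT INTO t_multi VALUES (1, 2);
INSERT INTO t_multi VALUES (1, 2);  -- succeeds: duplicate row is kept
```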

Can one column of a table have a primary key while another column has both UNIQUE and NOT NULL constraints? If so, how?
Answers:

Yes. You can have one column as the primary key and another column which allows no duplicates or NULLs. E.g., a Salary table can have employee ID as the primary key,
and also contain a TAX-ID column which can be neither NULL nor duplicated.
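A minimal sketch of such a table (names are illustrative):

```sql
-- emp_id is the primary key; tax_id is separately UNIQUE and NOT NULL
CREATE TABLE salary_tbl
(
    emp_id  INTEGER NOT NULL PRIMARY KEY,
    tax_id  INTEGER NOT NULL UNIQUE,
    sal     INTEGER
);
```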

How many of Codd's rules are satisfied by the Teradata database?
Answers:
Teradata satisfies 12 of Codd’s rules.

Explain about Skew Factor?

Answers:

The data distribution of table among AMPs is called Skew Factor . Generally For Non-Unique PI we get duplicate values so the more duplicate vales we get more the data have same row hash so all the same data will come to same amp, it makes data distribution inequality,One amp will store more data and other amp stores less amount of data, when we are accessing full table,
The amp which is having more data will take longer time and makes other amps waiting which leads processing wastage In this situation (unequal distribution of data)we get Skew Factor HighFor this type of tables we should avoid full table scans.
ex:AMP0 AMP110000(10%) 9000000(90%)in this situation skew factor is very high 90%

What is the use of COMPRESS in Teradata? Explain.
Answers:

Compression is used to minimize table size; for example, when a table is growing rapidly we can use compression to reduce its size.
Conditions:
1. Compression must be declared at table creation time.
2. We can compress up to 255 distinct values per column (not 255 columns).
3. We can't compress variable-length fields (VARCHAR, VARBYTE, ...).

Explanation for condition 2:
create table tab1 ( ... Order_type char(25) compress ('air','sea','road') ... );
In the above example, Order_type has 3 possible values, one of which is chosen for every order. Column values like these, which repeat throughout the table, are good candidates for the COMPRESS clause. The compressed values are stored once in the table header rather than in each row; a row then merely indicates which compressed value (if any) applies, and the value is materialized from the header when the column is read.
e.g.
The Dept of Taxes in Washington has a database that tracks all people working in Washington. Around 99.9% of the tax payers would have Washington as the state code in their address. Instead of storing "Washington" in millions of rows, compression stores the value "Washington" once in the table header, and each row simply flags that the compressed value applies unless another value is present in the column.

What is the process to restart the multiload if it fails?

Answers:

MultiLoad creates 2 error tables and 1 work table. When MultiLoad fails, we have to unlock the main table; the error tables and work tables are not locked, unlike FastLoad.

To unlock the main table after a failure in the acquisition phase:
RELEASE MLOAD <tablename> ;
To release the lock after an application-phase failure:
RELEASE MLOAD <tablename> IN APPLY ;

Can we load a MULTISET table using MultiLoad?
Answers:

We can load SET and MULTISET tables using MultiLoad. When loading a MULTISET table with MultiLoad, duplicate rows are NOT rejected; we have to take care of them before loading. In contrast, FastLoad automatically rejects duplicate rows while loading, whether the target table is SET or MULTISET.

Can I use “drop” statement in the utility “fload”?

Answers:

Yes, but you have to declare it outside the FastLoad block, i.e. it must not appear between BEGIN LOADING and END LOADING. FastLoad also supports DELETE, CREATE and DROP statements, which must likewise be declared before the FastLoad block; inside the block only INSERT is allowed.

Is it possible to have two primary keys in the same table?
Answers:

Primary key
==========
1. A table can have only one primary key.
2. A primary key can consist of more than one column - up to 64 columns.
3. Cannot contain NULL (missing) values.
4. Must be unique (no duplicate values).

Foreign key
==========
1. A table may have zero or more foreign keys - up to 32.
2. A foreign key can consist of more than one column - up to 64 columns.
3. Can contain NULL values.
4. Can contain duplicate values.
5. There must be a corresponding primary key in the parent table to enforce referential integrity for a foreign key.

What are the types of tables in Teradata ?

Answers:

1. set table
2. multiset table
3. derived table
4. global temporary table(temporary table)
5. volatile table

How is referential integrity handled in Teradata?
Answers:

Referential integrity can be handled in Teradata by using the TPump utility.

Join strategies?

Join strategies: there are 2 tables, table A with 10 million records and table B with 100 million records. We join the two tables, and the EXPLAIN plan shows that Teradata will take table A and redistribute it.
Question 1: With that plan, is the optimizer doing the correct job? Justify.
Question 2: In the same example, suppose the optimizer instead takes table B (100 million records) and distributes it. Is the optimizer doing its best, and how would you avoid this situation?
Answers:

Teradata is smart enough to decide when to redistribute and when to copy.
It compares the tables: are they of comparable size, or is one much bigger than the other? Based on that, it decides whether to redistribute the rows by hash or to copy the smaller table to all the AMPs, duplicating it into each AMP's spool space. Remember that joins always take place in the AMPs' spool space, and the matching rows of both tables must end up on the same AMP. By redistributing (or duplicating) rows, Teradata makes the join AMP-local for the 100-million-row table. Whatever Teradata does, it does with space, performance and efficiency in mind.

My simple formula:
If one table is small, duplicate it to all the AMPs to get an AMP-local join.
Joins are always made AMP-local; if they cannot be, you have a high chance of running out of spool space.

What is the difference between a star schema and FSLDM?
Answers:
FSLDM --> Financial Services Logical Data Model (Teradata FS-LDM 7.0). It was developed by Teradata for the financial sector (specifically banks) and can be customised based on user requirements.

StarSchema --> Its a relational database schema for representing multidimensional data. The data is stored in a central fact table, with one or more tables holding information on each dimension.

What is the difference between Global temporary tables and Volatile temporary tables?

Answers:

Global Temporary tables (GTT):
1. When created, the table definition goes into the Data Dictionary.
2. When materialized, the data goes into temp space.
3. Therefore the data is active only until the session ends, while the definition remains until it is dropped with a DROP TABLE statement (DROP TABLE ... ALL if dropped from another session).
4. You can collect statistics on a GTT.

Volatile Temporary tables (VTT):
1. The table definition is stored in the system cache.
2. The data is stored in spool space.
3. Therefore both the data and the definition are active only until the session ends.
4. No collected statistics for a VTT. Also, with a volatile table you cannot put default values at the column level while creating the table.
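A minimal sketch of a volatile table for comparison (names are illustrative):

```sql
-- Definition lives in the session cache, rows in spool;
-- both disappear when the session ends
CREATE VOLATILE TABLE vt_orders
(
    ord_number  INTEGER,
    order_total INTEGER
)
PRIMARY INDEX (ord_number)
ON COMMIT PRESERVE ROWS;
```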

Teradata performance tuning and optimization

Answers:

1. Collect statistics on join and selection columns
2. Use EXPLAIN statements to analyze query plans
3. Avoid product joins when possible
4. Select an appropriate primary index to avoid skewed storage
5. Avoid redistribution when possible
6. Use sub-selects instead of big "IN" lists
7. Use derived tables
8. Use GROUP BY instead of DISTINCT (GROUP BY sorts the data locally on the VPROC; DISTINCT sorts the data after it is redistributed)
9. Use compression on large tables
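As a sketch of points 1 and 8, assuming a hypothetical employee table (names are illustrative only):

COLLECT STATISTICS ON employee COLUMN (dept_no); /* point 1: gives the optimizer demographics */

SELECT dept_no
FROM employee
GROUP BY dept_no; /* point 8: preferred over SELECT DISTINCT dept_no on large tables */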

How do you transfer large amount of data in Teradata?

How do you transfer large amount of data in Teradata?
Answers:

Transferring large amounts of data can be done using the various Teradata application utilities which reside on the host computer (mainframe or workstation), i.e. BTEQ, FastLoad, MultiLoad, TPump and FastExport.

* BTEQ (Basic Teradata Query) supports all 4 DMLs: SELECT, INSERT, UPDATE and DELETE. BTEQ also supports IMPORT/EXPORT protocols.
* FastLoad, MultiLoad and TPump transfer data from the host to Teradata.
* FastExport is used to export data from Teradata to the host.

How can you use HASH FUNCTIONS to view Data Distribution across all AMPs in Teradata?

How can you use HASH FUNCTIONS to view Data Distribution across all AMPs in Teradata?
Answers:

Hash functions can be used to view the data distribution of rows for a chosen primary index.

SELECT HASHAMP(HASHBUCKET(HASHROW(<primary_index_columns>))) AS "AMP#", COUNT(*)
FROM <tablename>
GROUP BY 1
ORDER BY 2 DESC;

HASHROW – returns the row hash value for a given value
HASHBUCKET – the grouping of a specific hash value
HASHAMP – the AMP that is associated with the hash bucket

By looking at the result set of the above query you can easily see the data distribution across all AMPs in your system, and from there identify uneven data distribution.

How can you track Login Parameters of users in Teradata?

How can you track Login Parameters of users in Teradata?
Answers:

You can view all these parameters in this Data Dictionary Table DBC.LOGONOFF
SELECT LOGDATE, LOGTIME, USERNAME, EVENT FROM DBC.LOGONOFF;

How can you find the Teradata Release and Version information from Data Dictionary Table?

How can you find the Teradata Release and Version information from Data Dictionary Table?
Answers:

To find Release and Version information you can query this Data Dictionary table DBC.DBCINFO

SELECT * FROM DBC.DBCINFO;

How can you find the Table Space Size of your table across all AMPs?

How can you find the Table Space Size of your table across all AMPs?
Answers:

You can find the Table Space Size of your table from this Data Dictionary Table DBC.TABLESIZE

SELECT DATABASENAME, TABLENAME, SUM(CURRENTPERM)
FROM DBC.TABLESIZE
WHERE DATABASENAME = '<databasename>' AND TABLENAME = '<tablename>'
GROUP BY DATABASENAME, TABLENAME;

How can you determine I/O and CPU usage at a user level in Teradata?

How can you determine I/O and CPU usage at a user level in Teradata?
Answers:

You can find out I/O and CPU Usage from this Data Dictionary Table DBC.AMPUSAGE;

SELECT ACCOUNTNAME, USERNAME, SUM(CPUTIME) AS CPU, SUM(DISKIO) AS DISKIO FROM DBC.AMPUSAGE GROUP BY 1,2 ORDER BY 3 DESC;

Explain TPUMP (Teradata Parallel Data Pump) Utility in Teradata?

Explain TPUMP (Teradata Parallel Data Pump) Utility in Teradata?
Answers:

* TPUMP allows near real time updates from Transactional Systems into the Data Warehouse.
* It can perform Insert, Update and Delete operations or a combination from the same source.
* It can be used as an alternative to MLOAD for low volume batch maintenance of large databases.
* TPUMP allows target tables to have Secondary Indexes, Join Indexes, Hash Indexes, Referential Integrity, Populated or Empty Table, Multiset or Set Table or Triggers defined on the Tables.
* TPUMP can use many sessions, as it does not have a session limit.
* TPUMP uses row hash locks thus allowing concurrent updates on the same table.

Explain Ferret Utility in Teradata?

Explain Ferret Utility in Teradata?

Ferret (File Reconfiguration tool) is a utility used to display and set disk space utilization parameters within the Teradata RDBMS. When you select the Ferret utility parameters, it dynamically reconfigures the data on the disks. You can run this utility through Teradata Manager; to start the Ferret utility, type START FERRET in the database window.

Answers:
Following commands can be used within Ferret Utility:

1. SHOWSPACE – Reports the amount of disk cylinder space in use and the amount of disk cylinder space available in the system. It gives you information about Permanent Space cylinders, Spool Space cylinders, Temporary Space cylinders, Journaling cylinders, Bad cylinders and Free cylinders. For each of these it presents three parameters: average utilization per cylinder, percentage of total available cylinders and number of cylinders.
2. SHOWBLOCKS – Helps you identify the data block size and the number of rows per data block. This command displays the disk space information for a defined range of data blocks and cylinders.

Can you recover the password of a user in Teradata?

Can you recover the password of a user in Teradata?
Answers:

No, you cannot recover the password of a user in Teradata. Passwords are stored in the data dictionary table DBC.DBASE using a one-way encryption method. You can view the encrypted passwords using the following query:

SELECT * FROM DBC.DBASE;

What is Reconfiguration Utility in Teradata and What it is used for?

What is Reconfiguration Utility in Teradata and What it is used for?
Answers:

* When we feed a Primary Index value to the hashing algorithm, it returns a Row Hash (a 32-bit number) value, which is used to make entries into the Hash Maps.
* Hash Maps are the mechanism for determining which AMP will get that row.
* Each Hash Map is an array of 65,536 entries and its size is close to 128 KB.

When Teradata is installed on a system, there are some scripts which we need to execute, i.e. the DIP scripts. These create Hash Maps of 65,536 entries for the current configuration. But what if you want to add some more AMPs to your system?

Reconfiguration (Reconfig) is a technique for changing the configuration (i.e. changing the number of AMPs in a system) and is controlled by the Reconfiguration Hash Maps. The system builds Reconfiguration Hash Maps by reassigning hash map entries to reflect the new configuration of the system.

Let's understand this concept with the help of an example: suppose you have a 4-AMP system, which holds 65,536 entries. Each AMP is responsible for holding (65,536/4 = 16,384) 16,384 entries.

Now you have added 2 more AMPs to your current configuration, so you need to reconfigure your system. Each AMP would now be responsible for holding approximately (65,536/6 ≈ 10,922) 10,922 entries.

What is the difference between Sub-Query & Co-Related Sub-Query?

What is the difference between Sub-Query & Co-Related Sub-Query?
Answers:

When a query is written in a nested manner, it is termed a sub-query. A sub-query gets executed once for the parent statement, whereas a co-related sub-query gets executed once for each row of the parent query.

The following is a co-related sub-query, since the inner query references Emp.Deptno from the outer query:

SELECT Empname, Deptno, Salary
FROM Employee Emp
WHERE Salary = (SELECT MAX(Salary) FROM Employee WHERE Deptno = Emp.Deptno)
ORDER BY Deptno;
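For contrast, here is a plain (non-correlated) sub-query against the same hypothetical Employee table; the inner query runs only once, since it does not reference the outer query:

SELECT Empname, Deptno, Salary
FROM Employee
WHERE Salary = (SELECT MAX(Salary) FROM Employee); /* inner query executes a single time */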

What is the difference between Access Logging and Query Logging in Teradata?

What is the difference between Access Logging and Query Logging in Teradata?
Answers:

1. Access Logging is concerned with security (i.e. who is doing what). In access logging you ask the database to log who is doing what on a given object. The information stored is based on the object, not on the SQL fired or the user who fired it.
2. Query Logging (DBQL) is used for debugging (i.e. what is happening around?). In case of DBQL, the database keeps tracking various parameters, i.e. the SQL, resource and spool usage, steps and other things which help you understand what is going on; this information is fruitful for debugging a problem. Further, DBQL is enabled on a user id rather than on an object such as a table.
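A sketch of how each kind of logging is switched on (the database, table and user names are illustrative, and the exact options vary by release):

BEGIN LOGGING ON EACH SELECT ON mydb.salary_table; /* Access Logging: tied to the object */

BEGIN QUERY LOGGING ON etl_user; /* DBQL: tied to the user id */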

What is FILLER command in Teradata?

What is FILLER command in Teradata?
Answers:

While running FastLoad or MultiLoad, if you don't want to load a particular field from the data file into the target table, use the FILLER command to achieve this. Syntax for the FILLER command would be as follows:

.LAYOUT FILE_PRODUCT; /* It is input file layout name */
.FIELD Prod_No * char(11); /* To load data into Prod_No */
.FIELD Prod_Name * char(11); /* To load data into Prod_Name */
.FIELD Location * char(11); /* To load data into Location */
.FILLER Prod_Chars * char(20); /* To skip the next 20 characters in the input record */

What are TPUMP Utility Limitations?

What are TPUMP Utility Limitations?
Answers:

Following are the limitations of Teradata TPUMP Utility:
* Use of SELECT statement is not allowed.
* Concatenation of Data Files is not supported.
* Exponentiation & aggregate operators are not allowed.
* Arithmetic functions are not supported.

What are the MultiLoad Utility limitations?

What are the MultiLoad Utility limitations?
Answers:

MultiLoad is a very powerful utility; it has following limitations:

* MultiLoad Utility doesn’t support SELECT statement.
* Concatenation of multiple input data files is not allowed.
* MultiLoad doesn’t support arithmetic functions, i.e. ABS, LOG etc., in the Mload script.
* MultiLoad doesn’t support exponentiation and aggregate operators, i.e. AVG, SUM etc., in the Mload script.
* MultiLoad doesn’t support USIs (Unique Secondary Indexes), Referential Integrity, Join Indexes, Hash Indexes and Triggers.
* Import tasks require the use of a PI (Primary Index).

What are the functions of a Teradata DBA?

What are the functions of a Teradata DBA?
Answers:

Following are the different functions which a DBA can perform:
1. User Management – Creating and managing Users, Databases, Roles, Profiles and Accounts.
2. Space Allocation – Assigning Permanent Space, Spool Space and Temporary Space.
3. Access of Database Objects – Granting and Revoking Access Rights on different database objects.
4. Security Control – Handling logon and logoff rules for Users.
5. System Maintenance – Specification of system defaults, restart etc.
6. System Performance – Use of Performance Monitor(PMON), Priority Scheduler and Job Scheduling.
7. Resource Monitoring – Database Query Log(DBQL) and Access Logging.
8. Data Archives, Restores and Recovery – ARC Utility and Permanent Journals.

What are the 5 phases in a MultiLoad Utility?

What are the 5 phases in a MultiLoad Utility?
Answers:

* Preliminary Phase – Basic Setup
* DML Phase – Get DML steps down on AMPs
* Acquisition Phase – Send the input data to the AMPs and sort it
* Application Phase – Apply the input data to the appropriate Target Tables
* End Phase – Basic Cleanup
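The phases map onto the sections of a MultiLoad script; here is a minimal sketch with illustrative table, layout and file names:

.LOGTABLE mydb.mload_log;
.LOGON 127.0.0.1/dbc,dbc;
.BEGIN MLOAD TABLES mydb.employee; /* Preliminary Phase: basic setup */
.LAYOUT emp_layout;
.FIELD Emp_No * CHAR(10);
.FIELD Emp_Name * CHAR(20);
.DML LABEL ins_emp; /* DML Phase: steps sent to the AMPs */
INSERT INTO mydb.employee VALUES (:Emp_No, :Emp_Name);
.IMPORT INFILE emp.dat LAYOUT emp_layout APPLY ins_emp; /* Acquisition and Application Phases */
.END MLOAD; /* End Phase: basic cleanup */
.LOGOFF;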

How to eliminate Product Joins in a Teradata SQL query?

How to eliminate Product Joins in a Teradata SQL query?
Answers:

1. Ensure statistics are collected on join columns; this is especially important if the columns you are joining on are not unique.
2. Make sure you are referencing the correct alias.
3. Also, if you have an alias, you must always reference it instead of a fully qualified table name.
4. Sometimes product joins happen for a good reason: when joining a small table (100 rows) to a large table (1 million rows), a product join can make sense.

How does Hashing happen in Teradata?

How does Hashing happen in Teradata?

Answers:

* Hashing is the mechanism through which data is distributed and retrieved to/from AMPs.
* Primary Index (PI) value of a row is the input to the Hashing Algorithm.
* Row Hash (32-bit number) value is the output from this Algorithm.
* Table Id + Row Hash is used to locate Cylinder and Data block.
* Same Primary Index value and data type will always produce same hash value.
* Rows with the same hash value will go to the same AMP.

So data distribution depends directly on row hash uniqueness; be careful while choosing indexes in Teradata.
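A quick way to see these rules in action: the same literal value always hashes to the same row hash, and hence to the same AMP.

SELECT HASHROW(12345) AS Row_Hash,
       HASHBUCKET(HASHROW(12345)) AS Bucket_No,
       HASHAMP(HASHBUCKET(HASHROW(12345))) AS AMP_No; /* rerunning this always returns the same values */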