5.0 Embedded software design
5.1 Introduction
The embedded software that decodes the PAL signal combines a DSP mathematical
algorithm with control algorithms that load the data to be processed and send
the processed data to a display system.
The control algorithms are named load line and send line respectively. They
rely on the PAL synchronisation pulse at the beginning of each line to know
when to load and send individual lines. A new line is loaded each time a sync
pulse is detected while data is being sent. The line processing algorithm
processes each line and extracts the colour information. To do this it must
address three issues: QAM demodulation, phase locking and phase shifting. See
Appendices A1 to A3 for the data flow diagrams.
5.2 Load Line
5.2.1 Introduction
This function loads a complete line from external memory into TMS320 internal
memory. To do this it must be able to detect both the beginning and the end of
a line.
5.2.2 Method
The video signal conveniently contains line synchronisation pulses at the
beginning and end of each line, so by detecting the sync pulses it is possible
to load complete lines. Sync pulses manifest themselves as data values of 0;
all values of 40 or greater are image values.
For the code to detect a new line it must receive data that indicates
the end of a sync pulse and the start of data. If a data value is loaded but
a sync pulse has not been received prior to the data then all data loaded will
be discarded until the next sync pulse is detected.
When sampling at 20 MHz a sync pulse lasts for n samples. Loading this data
would be a waste of memory as it holds no image data, so each successive sync
sample is loaded and discarded. A sync test loop is entered when the first sync
value is detected.
When the first value after the sync pulse is detected, data is loaded
successively into an array. Finally, when the next sync pulse is detected, two
sync values are stored to indicate the end of the line, and the process is
repeated when required.
See Appendix A4 for ASM code and flow diagram
5.2.3 Pseudo Code
The method is shown in the pseudo code below.
Load:   Load data value
        If data value is >0 then goto Load:
Zeros:  Load data value
        If data =0 then goto Zeros:
        Store the non-zero value just loaded in array
        Inc array pointer
Data:   Load data value
        If data value is >0 then
        {
            Store non-zero value in array
            Inc array pointer
            Goto Data:
        }
        Store zero value in array
        Inc array pointer
        Store zero value in array
        Inc array pointer
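The labelled steps above can be sketched as a small state machine. The Python
below is an illustration only and is not the assembly routine of Appendix A4:
the function name load_line and the use of a Python iterator in place of the
external memory read are assumptions made for the example. A value of 0 is
treated as a sync sample, as described in the method.

```python
def load_line(samples):
    """Return one complete line, terminated by two zero (sync) values.

    `samples` is any iterable of integer sample values (hypothetical stand-in
    for the external memory read). Returns None if the stream ends before a
    complete line has been seen.
    """
    it = iter(samples)

    # Load: discard data until a sync (zero) value is seen, so that a line
    # that is already part-way through is never stored.
    for value in it:
        if value == 0:
            break
    else:
        return None

    # Zeros: discard the remaining sync samples. The first non-zero value
    # is the start of the line and is stored.
    line = []
    for value in it:
        if value != 0:
            line.append(value)
            break
    else:
        return None

    # Data: store values until the next sync pulse begins.
    for value in it:
        if value == 0:
            break
        line.append(value)
    else:
        return None

    # Two zero values mark the end of the line for the later stages.
    line.extend([0, 0])
    return line


print(load_line([55, 60, 0, 0, 0, 41, 90, 200, 0, 0, 77]))  # [41, 90, 200, 0, 0]
```

In the example call the leading values 55 and 60 belong to a partly received
line and are discarded, which matches the rule that data arriving before a
sync pulse is thrown away.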
5.3 Colour burst phase change compensation
5.3.1 Introduction
This algorithm must detect whether the colour burst is at +135 or –135 degrees
relative to the end of the line synchronisation pulse.
5.3.2 Method
This is done by using two lookup tables containing sinusoidal waveforms that
are at 0 and 90 degrees relative to the end of the sync pulse. The two tables
are used as non-coherent QAM demodulation signals. The colour burst is
multiplied by the 0 and 90 degree waves and the products are stored in results
tables. The results are then filtered, leaving a DC value in each table. The DC
values correspond to the phase of the colour burst, which can only be one of
two values. Since the demodulation signal is non-coherent the DC values will
contain an error. The error affects both DC values, so the best way to
determine the phase of the burst is to compare them. If the 0 degree DC value
is greater than the 90 degree DC value then the burst phase is +135 degrees. If
the 90 degree DC value is greater than the 0 degree DC value then the phase is
–135 degrees.
This non-coherent method is not appropriate for demodulating the PAL picture
data itself, as the error would be significant and would change the colour
palette, thus reproducing the wrong colours.
See Appendix A5 for ASM code and flow diagram
5.3.3 Colour burst phase change compensation pseudo code
Note: A sliding window filter as shown in the theory is used.
Set register A to starting address of raw pal data
Set register B to starting address of sin modulation table
Set register C to starting address of sin results table
While (sample_loop<64) do
{
    Clear the accumulator
    While (mean_loop<9) do
    {
        Multiply the data value at the address pointed to by register A
        by the data value at the address pointed to by register B
        Add the resultant value to the accumulator
        Inc register A
        Inc register B
    }
    Divide the accumulator by 9
    Store the resultant value in the location pointed to by register C
    Add 1 to register C
    Subtract 8 from register A
    Subtract 8 from register B
}
Set register A to starting address of raw pal data
Set register B to starting address of cos modulation table
Set register D to starting address of cos results table
While (sample_loop<64) do
{
    Clear the accumulator
    While (mean_loop<9) do
    {
        Multiply the data value at the address pointed to by register A
        by the data value at the address pointed to by register B
        Add the resultant value to the accumulator
        Inc register A
        Inc register B
    }
    Divide the accumulator by 9
    Store the resultant value in the location pointed to by register D
    Add 1 to register D
    Subtract 8 from register A
    Subtract 8 from register B
}
If DC value in table C > DC value in table D set + (burst phase is 135 degrees)
If DC value in table D > DC value in table C set - (burst phase is –135 degrees)
Return phase.
See appendix for assembly code and flow chart.
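The multiply and sliding window stages can be tried on a synthetic burst. The
Python below is a floating point sketch under stated assumptions, not the
fixed point routine of the appendix: the names burst_dc_values and
sliding_mean_of_product are hypothetical, the tables are generated on the fly
rather than stored, and the subcarrier is taken to be the standard PAL value of
4.43361875 MHz sampled at the 20 MHz quoted earlier. It returns the two DC
values only; the comparison rule described in the method would then be applied
to them.

```python
import math

FS = 20.0e6                     # sample rate quoted in the text
FSC = 4.43361875e6              # standard PAL colour subcarrier (assumption)
STEP = 2 * math.pi * FSC / FS   # subcarrier phase advance per sample
WINDOW = 9                      # sliding window length used in the pseudo code


def sliding_mean_of_product(data, table, count):
    """Multiply data by table and take a 9-sample sliding mean, `count` times."""
    results = []
    for start in range(count):
        acc = 0.0                                 # cleared for every window
        for k in range(WINDOW):
            acc += data[start + k] * table[start + k]
        results.append(acc / WINDOW)
    return results


def burst_dc_values(burst, count=64):
    """Return (dc_0, dc_90): mean filtered product with the 0 and 90 degree tables."""
    n = count + WINDOW - 1                        # samples needed from the burst
    sin_table = [math.sin(STEP * i) for i in range(n)]   # 0 degree table
    cos_table = [math.cos(STEP * i) for i in range(n)]   # 90 degree table
    dc_0 = sliding_mean_of_product(burst, sin_table, count)
    dc_90 = sliding_mean_of_product(burst, cos_table, count)
    return sum(dc_0) / count, sum(dc_90) / count


# Synthetic unit-amplitude burst at +135 degrees (illustrative data only).
phase = math.radians(135)
burst = [math.sin(STEP * i + phase) for i in range(72)]
print(burst_dc_values(burst))
```

For a burst modelled as sin(wt + p) the two values approach 0.5 cos(p) and
0.5 sin(p). At these two frequencies a 9 sample window spans very nearly two
subcarrier cycles (9 x 4.43361875 / 20 = 1.995), so the double frequency
product term almost cancels within each window.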
5.4 Phase Lock
5.4.1 Introduction
This algorithm is used to compare two sinusoidal waves of the same frequency
and similar magnitudes and return a value corresponding to the number of degrees
that the waves are out of phase.
5.4.2 Method
After determining whether the burst is at 135 or -135 degrees relative to
the sync pulse, a lookup table starting with an offset of 135 or -135 degrees
is selected. This table is then used in the phase locking process.
The waveforms are held in sampled form in memory tables. Using table pointers,
the entries of the two tables are compared with one another in sequence. The
difference between each pair of compared entries is stored in a third table.
The difference values stored in the third table correspond to the degree to
which the waves are out of phase, i.e. a large error shows that the waves are
far out of phase and a small value shows that the waves are closely matched.
If after a comparison the error is large the reference wave is offset
by one table value and the tables are re-compared. This process is repeated
until a small error value is found.
It is necessary to make a number of comparisons because, if only one comparison
were made, the difference between the two waves could be zero even though they
are still out of phase. This is shown by point x on the graph below.

Since the algorithm is to be implemented on a TMS320 DSP microprocessor, it
must be easy to convert into assembly code. The TMS320 can work with negative
integers; however, when a negative value is transferred from a 32 bit register
to a 16 bit register the word is truncated, so the sign bit is lost and the
resultant value is meaningless. The comparison is done by subtraction using the
32 bit accumulator, but the data is stored in 16 bit memory. To allow the
TMS320 to cope with negative difference values it is necessary either to store
each 32 bit word as two 16 bit words or to store only positive difference
values. The second method was chosen because it was simpler. The algorithm
therefore replaces all negative 32 bit error values with a 0 error before
storing them in 16 bit memory.
The closeness of the waveforms is a function of the number of comparisons
and the difference between the values compared. Making many comparisons with a
high tolerance for error will give similar results to making only a few
comparisons with a low tolerance.
See Appendix A6 for ASM code and flow diagram
5.4.3 Pseudo code for phase lock
x= the number of comparisons
y= margin for error
Set offset =0
Load table A with a reference sine wave of the same frequency and similar
magnitude to the burst sine wave.
Load table B with x values of data from the colour burst
Define Table C
Retry:  Set error flag =0
        Set loop count =0
        Set pointer A to start of table A
        Set pointer B to start of table B
        Set pointer C to start of table C
Loop0:  Subtract value at pointer A + offset from value at pointer B.
        If resultant value is negative store zero at address pointed to by
        pointer C
        Else store resultant value at address pointed to by pointer C
        Inc pointer A
        Inc pointer B
        Inc pointer C
        Inc loop count
        If loop count < x then go to Loop0:
        Set pointer C to start of Table C
        Set loop count =0
Loop1:  If value at address pointed to by pointer C >y then set error flag =1
        Inc pointer C
        Inc loop count
        If loop count < x then go to Loop1:
        If error flag=0 then go to Locked:
        Else increment offset and go to Retry:
Locked: Return offset value required to match the two sinusoidal waves.
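The offset search can be exercised on synthetic data. The Python below is a
sketch under stated assumptions rather than the routine of Appendix A6: the
names phase_lock and make_sine_table are hypothetical, the reference table is
generated at the standard PAL subcarrier frequency sampled at 20 MHz, and a
max_offset limit (not part of the pseudo code) is added so that the search
ends if no lock is found. Negative differences are replaced by zero, as
described in the method.

```python
import math

FS = 20.0e6            # sample rate quoted in the text
FSC = 4.43361875e6     # standard PAL colour subcarrier (assumption)


def make_sine_table(length, amplitude=100):
    """Integer reference sine table (hypothetical helper for the example)."""
    step = 2 * math.pi * FSC / FS
    return [round(amplitude * math.sin(step * n)) for n in range(length)]


def phase_lock(reference, burst, x, y, max_offset):
    """Return the first offset at which x comparisons all lie within margin y.

    `reference` must hold at least x + max_offset values. Returns None when
    no offset in 0..max_offset gives a lock.
    """
    for offset in range(max_offset + 1):
        errors = []                              # table C in the pseudo code
        for i in range(x):
            diff = burst[i] - reference[i + offset]
            errors.append(diff if diff > 0 else 0)   # negatives stored as 0
        if all(e <= y for e in errors):          # error flag stays clear
            return offset
    return None


reference = make_sine_table(64)
burst = reference[5:5 + 16]     # synthetic burst, five samples ahead of table
print(phase_lock(reference, burst, x=16, y=2, max_offset=20))  # 5
```

Because negative differences are stored as zero, a single comparison can pass
even when the waves are far apart. The search relies on x covering enough of a
cycle for a positive difference to appear, which is the trade-off between x
and y discussed above.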
5.4.4 Conclusion
This method is effective because two parameters define the closeness of the
lock. A large number of consecutive matches can statistically indicate a good
match, even if the error in each match is high. A small number of matches with
a small error in each also indicates a good match.
5.5 Digital Demodulation and Filtration
5.5.1 Introduction
The PAL chrominance signal contains two components that have been modulated
using quadrature amplitude modulation (QAM). QAM is a method whereby two
signals are modulated with carriers of the same frequency but 90°
out of phase and then added together.
5.5.2 Method
To demodulate the signals, two demodulation carriers are required. This
is shown below. The resulting signals contain components of the form
$\sin(4\pi f_{sub} t)$, i.e. at twice the carrier frequency, which must be
filtered out using a low pass filter. See Theory.
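The double frequency terms can be made explicit with a short derivation. As an
assumption for illustration (the exact form is deferred to the theory section),
write the chrominance signal as $c(t) = U\sin\omega t + V\cos\omega t$ with
$\omega = 2\pi f_{sub}$. Multiplying by each demodulation carrier gives:

```latex
\begin{align*}
c(t)\sin\omega t &= \tfrac{U}{2} - \tfrac{U}{2}\cos 2\omega t + \tfrac{V}{2}\sin 2\omega t \\
c(t)\cos\omega t &= \tfrac{V}{2} + \tfrac{V}{2}\cos 2\omega t + \tfrac{U}{2}\sin 2\omega t
\end{align*}
```

Since $2\omega t = 4\pi f_{sub} t$, a low pass filter that removes the second
and third terms of each line leaves U/2 and V/2 respectively.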
This can be done digitally by using lookup tables. As described in section 5.2
(Load Line), a complete line of digitised PAL data is loaded into memory. Two
other tables contain fixed sine and cosine data respectively, at the same
frequency as the chrominance QAM carrier. The tables are shifted by the offset
value derived in the phase lock stage so that the sine and cosine tables are in
phase with the sine and cosine components of the signal respectively. The
chrominance signal value at each table position is then multiplied by the sine
and cosine table values at the corresponding position. This is shown below.

See Appendix A7 for ASM code and flow diagram
5.5.3 QAM demodulation pseudo code.
Set register A to starting address of raw pal data
Set register B to starting address of sin modulation table
Set register C to starting address of sin results table
Add offset from phase lock algorithm to B
While (sample_loop<1188) do
{
    Clear the accumulator
    While (mean_loop<9) do
    {
        Multiply the data value at the address pointed to by register A
        by the data value at the address pointed to by register B
        Add the resultant value to the accumulator
        Inc register A
        Inc register B
    }
    Divide the accumulator by 9
    Store the resultant value in the location pointed to by register C
    Add 1 to register C
    Subtract 8 from register A
    Subtract 8 from register B
}
Set register A to starting address of raw pal data
Set register B to starting address of cos modulation table
Set register D to starting address of cos results table
Add offset from phase lock algorithm to B
While (sample_loop<1188) do
{
    Clear the accumulator
    While (mean_loop<9) do
    {
        Multiply the data value at the address pointed to by register A
        by the data value at the address pointed to by register B
        Add the resultant value to the accumulator
        Inc register A
        Inc register B
    }
    Divide the accumulator by 9
    Store the resultant value in the location pointed to by register D
    Add 1 to register D
    Subtract 8 from register A
    Subtract 8 from register B
}
A contains raw data
C contains U signal
D contains V signal
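The effect of the table offset can be checked with a synthetic chrominance
signal. The Python below is a floating point sketch under stated assumptions
and is not the routine of Appendix A7: the name demodulate is hypothetical,
the tables are generated on the fly, the subcarrier is taken as the standard
PAL value sampled at 20 MHz, and the chrominance is modelled as
U sin(wt) + V cos(wt). Both products are formed in one pass here, whereas the
pseudo code uses two separate loops.

```python
import math

FS = 20.0e6                     # sample rate quoted in the text
FSC = 4.43361875e6              # standard PAL colour subcarrier (assumption)
STEP = 2 * math.pi * FSC / FS   # subcarrier phase advance per sample
WINDOW = 9                      # sliding window length used in the pseudo code


def demodulate(chroma, offset, count):
    """Return (u, v) lists of `count` filtered products.

    `offset` is the table shift, in samples, found by the phase lock stage.
    `chroma` must hold at least count + WINDOW - 1 samples.
    """
    n = count + WINDOW - 1 + offset
    sin_table = [math.sin(STEP * i) for i in range(n)]
    cos_table = [math.cos(STEP * i) for i in range(n)]
    u, v = [], []
    for start in range(count):
        acc_u = acc_v = 0.0                      # cleared for every window
        for k in range(WINDOW):
            sample = chroma[start + k]
            acc_u += sample * sin_table[start + k + offset]
            acc_v += sample * cos_table[start + k + offset]
        u.append(acc_u / WINDOW)                 # table C in the pseudo code
        v.append(acc_v / WINDOW)                 # table D in the pseudo code
    return u, v


# Synthetic line: constant U and V, carrier three samples ahead of the tables.
U, V, shift = 0.6, -0.4, 3
chroma = [U * math.sin(STEP * (i + shift)) + V * math.cos(STEP * (i + shift))
          for i in range(120)]
u, v = demodulate(chroma, offset=shift, count=100)
print(round(u[0], 2), round(v[0], 2))  # 0.3 -0.2
```

With the correct offset the outputs settle at half the original U and V
values, the factor of one half coming from the product of two sinusoids. With
a wrong offset the two components leak into each other, which is why the phase
lock stage is needed first.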
5.6 Transmit Line
5.6.1 Introduction
After the values for U and V have been calculated and placed in tables, there
are three data tables: raw data, U data and V data. This data is sufficient to
draw a complete line on a visual display unit; however, to do this the data
must be transferred from the TMS320 to the PC. The hardware is designed to
allow the corresponding raw data, U data and V data to be transmitted in one
cycle. The software can then increment all three pointers and transmit the
three bytes for the next pixel.
5.6.2 Method
The TMS320 supports an instruction called IDLE, which causes the TMS320 to
go into suspend mode until an interrupt is received. After the processor has
executed all previous program stages and has a complete data array, it goes
into idle mode and waits for an interrupt. Upon receiving an interrupt, three
bytes are written sequentially to the parallel port, with a short delay between
each write to help the PC’s parallel port remain synchronised. The parallel
port’s wait signal is used to ensure that it stays synchronised.
The data can be either positive or negative, so a method was needed to allow
the PC to recognise both. This was done by dividing the data by two and adding
128 to it. After halving, a value of 0 is therefore sent as 128, a value of
–128 is sent as 0 and a value of +127 is sent as 255. When three bytes have
been transmitted the processor increments the table pointers and goes back into
the idle state. The above process is repeated until a zero value is detected in
the raw data table. This signifies the end of a line. The processor then sets
the program counter to start the whole process again, i.e. it branches to Load
Line.
See Appendix A8 for ASM code and flow diagram
5.6.3 Pseudo code for Send Line.
Set pointer for Raw data table to 0
Set pointer for U data table to 0
Set pointer for V data table to 0
Send data:
    IDLE
    Divide U data by 2
    Add 128 to U data
    Send U data to parallel port
    Wait for n processor cycles
    Divide V data by 2
    Add 128 to V data
    Send V data to parallel port
    Wait for n processor cycles
    Divide Raw data by 2
    Add 128 to Raw data
    Send Raw data to parallel port
    Wait for n processor cycles
    Inc U data pointer
    Inc V data pointer
    Inc Raw data pointer
    If Raw data = 0 then exit and return to the start of the entire algorithm
    If Raw data > 0 go to the Send data label.
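The byte encoding and the end of line test can be sketched as follows. This
Python is an illustration only and is not the routine of Appendix A8: the
names encode and send_line are hypothetical, the parallel port, IDLE and wait
steps are replaced by appending to a list, and the table values are assumed to
span roughly -256 to +254 before halving (the text does not state the range).
Floor division is used for the divide by two, which is also an assumption.

```python
def encode(value):
    """Halve a signed value and add 128 so that it fits an unsigned byte."""
    return (value // 2) + 128


def send_line(raw, u, v):
    """Return the byte sequence for one line in the order U, V, Raw per pixel.

    `raw` must end with a zero value, as stored by the load line stage.
    """
    out = []
    i = 0
    while True:
        out.append(encode(u[i]))      # Send U data
        out.append(encode(v[i]))      # Send V data
        out.append(encode(raw[i]))    # Send Raw data
        i += 1                        # Inc all three pointers
        if raw[i] == 0:               # zero in the raw table ends the line
            break
    return out


raw = [40, 200, 0, 0]
u = [0, -256, 0, 0]
v = [254, 10, 0, 0]
print(send_line(raw, u, v))  # [128, 255, 148, 0, 133, 228]
```

As in the pseudo code, the end of line test is made after the pointers have
been incremented, so the first pixel is always sent and the terminating zero
is never transmitted.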