COEN 20 - Ch 9

Floating Point

Martin Rios-Cardenas (lvl 5)

Last updated

10 months ago

Date created

Nov 30, 2020

Cards(66)

Floating Point (recall)

(3 cards)

What is floating-point?

Front

The representation for non-integral numbers

• Including very small and very large numbers
Back

What are some examples of floating-point numbers?

Front
1. Scientific notation
• $$-2.34 * 10^{56}$$
• $$+0.002*10^{-4}$$
• $$+987.02*10^9$$
2. Binary
• $$\pm 1.xxxxxxx_2 * 2^{yyyy}$$
3. Types in C
• float (single-precision FP)
• double (double-precision FP)
Back

How does floating-point work in binary?

Front

$$\pm 1.xxxxxxx_2 * 2^{yyyy}$$ (for example, $$1.0_2*2^{-1}$$)

The point is a binary point rather than a decimal point.

The x's represent the fraction.

The subscript 2 marks the number as base 2.

The y's represent the exponent.

There's a tradeoff between precision and range.

Back

IEEE 754 Floating-Point Format

(1 card)

What is the IEEE 754 Floating-Point Format?

Front

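We have this example where x = -11111111.01 x 2^7

In general, x = (-1)^s x (1 + Fraction) x 2^(actual exponent).

The s is the sign: 0 for positive, 1 for negative (1 bit).

The Fraction holds the bits to the right of the binary point of the normalized number (single 23 bits, double 52 bits); the leading 1 is implied and not stored. [The fraction fills the low-order bits of the word, left-justified and padded with 0's on the right.]

The stored exponent is the actual exponent plus a bias, 127 for single and 1023 for double (single 8 bits, double 11 bits).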

Back

Floating-Point Data Type

(1 card)

What is the single-precision floating-point (float) type?

Front

Back

float versus int32_t

(1 card)

What is the difference between int32_t y = 1000 and float x = 1000. (note the decimal point)?

Front

int32_t y

vs

float x

For float x, the sign bit is the MSB (0, since x is positive). In binary, 1000 is 1111101000_2 = 1.111101000_2 x 2^9, so the actual exponent is 9. Adding the excess 127 gives 136_10 as our 8-bit stored exponent. Finally, the fraction part follows, with the bit right next to the binary point as its MSB. We zero-extend to the right to fill all 23 bits.
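To check the fields described above, here is a minimal C sketch that splits a float into its sign, stored exponent, and fraction. The names float_fields, float_bits, and split_float are hypothetical helpers, not part of the course material, and the sketch assumes float is a 32-bit IEEE 754 single.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>
#include <string.h>   /* memcpy */

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper type: the three single-precision fields. */
typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored with the excess-127 bias */
    uint32_t fraction;  /* 23 bits, implied leading 1 not stored */
} float_fields;

/* Copy the raw bits of a float into an integer without converting the value. */
uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Split the 32-bit pattern into sign, stored exponent, and fraction. */
float_fields split_float(float x)
{
    uint32_t bits = float_bits(x);
    float_fields f;
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}
```

For 1000.0f this gives sign 0, stored exponent 136, and fraction 0x7A0000 (111101000_2 followed by 0's), while int32_t 1000 is simply 0x000003E8.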

Back

Floating-Point example

(4 cards)

How do we represent 0.75 in IEEE 754 in single-precision format?

Front

Convert the number to binary.

$$0.75_{10} \to 0.11_2$$

Normalize the number.

$$0.11_2 \to 1.1_2 * 2^{-1}$$

Add the excess of 127 to the exponent to get the stored exponent.

$$-1 + 127 = 126$$

Convert it to binary and store it in the next 8 positions after the MSB, which is 0 (positive).

$$126_{10} \to 01111110_2$$

Insert the fractional part (the bits after the binary point, here a single 1) in the remaining 23 bit positions, padding with 0's on the right.
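The steps above can be sketched in C by packing the three fields into a 32-bit word. The helper make_float is a hypothetical name, and the sketch assumes float is a 32-bit IEEE 754 single.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>
#include <string.h>   /* memcpy */

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Pack sign, stored exponent, and fraction into a float (hypothetical helper). */
float make_float(uint32_t sign, uint32_t stored_exponent, uint32_t fraction)
{
    uint32_t bits = ((sign & 1u) << 31)
                  | ((stored_exponent & 0xFFu) << 23)
                  | (fraction & 0x7FFFFFu);
    float x;
    memcpy(&x, &bits, sizeof x);   /* reinterpret the bits, no conversion */
    return x;
}

/* 0.75 = 1.1_2 * 2^-1: sign 0, stored exponent 126, fraction 100...0_2 */
float three_quarters(void)
{
    return make_float(0u, 126u, 0x400000u);
}
```

The same helper reproduces the other cards in this section: sign 1 with stored exponent 126 and a zero fraction gives -0.5, and stored exponent 128 with a zero fraction gives 2.0.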

Back

How do we represent decimal -0.5 in IEEE 754 single-precision format?

Front

We take into account the MSB is 1 since the number is negative.

Convert decimal into binary

$$0.5_{10} \to 0.1_2$$

Normalize the number

$$0.1_2 = 1.0_2 * 2^{-1}$$

Given our exponent, we calculate the excess

$$-1 + 127 = 126$$

Convert the excess into binary

$$126_{10} \to 01111110_2$$

Save this excess in the corresponding bit positions after the MSB.

Save the fractional part in the remaining bit positions.

Back

What is the binary representation of the binary number 111111.01 x 2^0 in the IEEE 754 single-precision format?

Front
Back

How do we represent decimal 2.0 in IEEE 754 single-precision format?

Front

Convert the whole part to binary: 2_10 = 10_2.

Normalize the binary: 10_2 = 1.0_2 x 2^1.

Given the exponent, add that exponent to 127 (1 + 127 = 128) and convert the sum to an 8-bit binary number (10000000_2).

Store it after the MSB, which is 0 (positive).

Store the remaining fractional part (all 0's here) in the respective bit positions.

Back

Single-precision Floating-point Representing Zero and One

(2 cards)

How do we represent 0 in single-precision floating-point?

Front

Zero is a special case: both the stored exponent and the fractional part are all 0's in binary. The sign bit may be 0 or 1, giving +0 and -0.

Back

How do we represent 1 in single-precision floating-point?

Front

In binary, we know 1 is 0001.

We can normalize it to 1.0 x 2^0.

This gives us the exponent part to be 127 and the fractional part to be all 0's.

Back

Floating-Point Registers

(3 cards)

Can floating point registers S0 to S15 be modified?

Front

These may be modified by functions

Back

Can floating-point registers S16 to S31 be modified?

Front

Not freely. They must be preserved by functions, so a function that uses them has to save and restore them.

Back

What are floating-point registers D0-D15 used for?

Front

They are used to hold 64-bit values. Yet, we can't perform 64-bit arithmetic.

Back

Floating-point unit

(5 cards)

What are the processors used in this floating-point unit?

Front
• We have the Integer Processor
  • We hold the registers R0 to R12
  • We have APSR flags
  • We have more familiarity with this processor
• We have the FP (floating-point) Processor
  • We hold the registers S0 to S31
  • We have FPU flags

Back

How does the FP processor process a constant as an input?

Front

We use the VMOV instruction, but it works for only a few values. Not recommended.

Back

How does the integer processor interact with the main memory?

Front

We have the LDR and STR instructions.

Back

How does the FP processor interact with the main memory?

Front

We have the VSTR and VLDR instructions.

Back

How does the integer processor process a constant as an input?

Front
• We can copy a constant into a register with the MOV, MVN, MOVW, and MOVT instructions
• We can use LDR as a pseudo-instruction to load a constant to a register
Back

Floating-Point Push & Pop

(2 cards)

How do we push FPU registers?

Front

We use the VPUSH instruction. Use at function entry to preserve registers S16 to S31 that are modified.

Back

How do we pop FPU registers?

Front

We use the VPOP instruction. Use at function exit to restore registers S16 to S31 that are modified.

Back

Function Parameters and Return Values

(2 cards)

How does the function call look for floating-point parameters?

Front
Back

How does a function with floating-point parameters look in assembly?

Front
Back

(2 cards)

How does VLDR work?

Front

Unlike LDR, VLDR has no pseudo-instruction form for constants. We need to define a label for the number, using .float as the directive. We can then use VLDR with that label if the constant is defined in the file.

Back

How does VMOV work?

Front

It works like MOV except we can work with a limited set of constants.

Back

VMOV Immediate Constants

(1 card)

What are some of the VMOV immediate constants?

Front

Small integers (like 4.0) and some simple fractions (like 0.5). Note that 0.0 is not one of them.

Back

Copying Floating-Point Data -- Register -> Register

(1 card)

What are some of the functions for VMOV?

Front

NOTE: These only copy data. They do NOT convert between integer and floating-point representations.
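The difference between copying and converting can be sketched in C. Here copy_bits stands in for a register-to-register VMOV and convert_value for VCVT.F32.S32; both names are hypothetical, and the sketch assumes float is a 32-bit IEEE 754 single.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>
#include <string.h>   /* memcpy */

static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits");

/* Like a VMOV between registers: the 32 bits move unchanged. */
float copy_bits(int32_t n)
{
    float x;
    memcpy(&x, &n, sizeof x);
    return x;
}

/* Like VCVT.F32.S32: produce the float with the same numeric value. */
float convert_value(int32_t n)
{
    return (float)n;
}
```

Converting the integer 1000 gives 1000.0, but copying its bits (0x000003E8) into a float gives a tiny denormal value instead. Only the bit pattern 0x447A0000 copies to 1000.0.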

Back

Copying Floating-Point Data -- Memory -> (Single or Double) Registers

(2 cards)

How does VSTR work?

Front
Back

How does VLDR work?

Front
Back

Converting between Integer and Floating Point

(1 card)

How does VCVT work?

Front
Back

Rounding Modes

(1 card)

What are the rounding modes?

Front
• Round to nearest even (default)
• Round toward pos inf
• Round toward neg inf
• Round toward zero
Back

float versus int32_t

(2 cards)

How do you define

float x = -1000.5 //or -1111101000.1_2
Front
Back

How do you define

int32_t y = -1000
Front
Back

Arithmetic with real numbers

(4 cards)

Solve the following Pythagorean Theorem function in assembly

float Hypotenuse(float side1, float side2)
Front
Hypotenuse:
    VMUL.F32   S0,S0,S0    // S0 = side1 * side1
    VMLA.F32   S0,S1,S1    // S0 += side2 * side2
    VSQRT.F32  S0,S0       // S0 = square root of S0
    BX         LR          // Return
Back

Find the discriminant in assembly

float Discriminant(float a, float b, float c){
return b * b - 4.0 * a * c ;
}

Front
Back

Find the volume of a cube in assembly

float VolumeOfCube(float height, float width, float depth){
return height * width * depth;
}
Front
Back

What kind of arithmetic can you do with real numbers and floating-point?

Front
Back

Arbitrary Floating-Point Constants

(1 card)

Find the area of a circle given a radius.

float AreaOfCircle(float radius){
}
Front
Back

Using Expressions to Create Constants

(1 card)

Calculate the volume of a sphere given a radius

float VolumeOfSphere(float radius){
return (4.0 / 3.0) * 3.14159 * radius * radius * radius ;
}
Front
Back

Comparing Real numbers

(1 card)

How can we compare floating-point numbers?

Front
Back

Interpreting Flags After VCMP

(1 card)

What are the possible conditions where we can compare two floating-point numbers?

Front
Back

Copying 0 to a Floating-Point Register

(1 card)

How do we copy 0 to a FP register?

Front

Since 0.0 is not one of the constants VMOV can encode (VMOV S0,0.0 is not allowed), we are left with

VSUB.F32 S0,S0,S0
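One caveat worth checking: subtracting a register from itself gives 0 only when the register holds a finite value. A minimal C sketch, assuming C float subtraction behaves like VSUB.F32 here (subtract_from_itself is a hypothetical name):

```c
#include <math.h>

/* Models VSUB.F32 S0,S0,S0: the result is x - x. */
float subtract_from_itself(float x)
{
    return x - x;
}
```

For any finite x the result is 0.0, but if x is infinity or NaN the result is NaN. So the trick is safe only when the old contents of the register are known to be an ordinary number.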
Back

Pointer to an Array of Floats

(4 cards)

Given this function

float TaylorPoly(float x, float coef_a[], int32_t terms)

what do we know about it?

Front

FP reg S0: x

int reg R0: coef_a (the address of coef_a[0])

int reg R1: terms

Back

When we pass an array of floats, where is this address passed to?

Front

Integer registers R0-R3

Back

How can we load the values of an array of floats to our registers?

Front

VLDMIA  R0!,{S1,S2,S3}

Back

When we pass an array of floats as a fn parameter, what are we passing?

Front

The address of the first element of the array

Back

Preserving Floating-Point Registers

(1 card)

Find the volume of a cone in assembly

float VolumeOfCone(float radius, float height){
return AreaOfCircle(radius) * height / 3.0 ;
}
Front
Back

Floating-Point Compare & Flags

(1 card)

int32_t ImaginaryRoots(float a, float b, float c){
return Discriminant(a, b, c) < 0.0 ;
//returns 1 or 0
}
Front
Back

FPU Instructions in IT Blocks

(1 card)

float LimitedIncrement(float a, float b){
if (a < b) a += 1.0 ;
return a ;
}
Front
Back

Floating-Point Equality Test

(1 card)

Front
Back

Extended Floating-Point Example

(1 card)

float QuadraticRoot(float a, float b, float c, int minus){
float root = sqrt(Discriminant(a, b, c)) ;
float top ;
if (minus)	top = -b - root ;
else		top = -b + root ;
return top / (2.0 * a) ;
}
Front
Back

FPU Programming - The Essentials

(11 cards)

Can we use constants as expressions for FP instructions?

Front
Back

Which registers hold function parameters of type float?

Front

Float parameters: S0-S15

All other parameters: R0-R3

Back

Can we write pseudo-instructions with VLDR like with LDR?

Front

No. Use .float to create a constant in memory, add a label to it, and use VLDR Sn,label to load it.

Back

Can we use VMOV with any constant?

Front

No. VMOV supports a very restricted set of immediate constants. Easiest to only use it to load small integers (like 4.0) and some simple fractions (like 0.5).
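To see which constants qualify, here is a C sketch that tests a float against the 8-bit VMOV.F32 immediate form. It assumes the encoding covers values of the form +/-(1 + k/16) x 2^e with k from 0 to 15 and e from -3 to 4, which is my reading of the ARM architecture manual; is_vmov_immediate is a hypothetical helper, so check the result against your assembler.

```c
#include <assert.h>   /* static_assert */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>   /* memcpy */

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* True if x has the assumed VMOV.F32 immediate form:
   +/-(1 + k/16) * 2^e with k in 0..15 and e in -3..4. */
bool is_vmov_immediate(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);

    uint32_t exponent = (bits >> 23) & 0xFFu;   /* stored, excess 127 */
    uint32_t fraction = bits & 0x7FFFFFu;

    bool exponent_ok = exponent >= 124u && exponent <= 131u;  /* 2^-3 .. 2^4 */
    bool fraction_ok = (fraction & 0x7FFFFu) == 0u;  /* only top 4 fraction bits */
    return exponent_ok && fraction_ok;
}
```

Under this assumption 4.0, 0.5, 1.0, and 31.0 qualify, while 0.0, 32.0, 0.1, and 3.14159 do not. Those need a .float constant and VLDR instead.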

Back

Do we need to specify the operand type for certain instructions?

Front

Yes. All instructions that perform arithmetic, data type conversion, or compares must specify the operand type, as in VADD.F32.

• VCVT requires two specifiers (VCVT.F32.S32).
Back

Do we append a condition code to an FPU instruction?

Front

Yes. In an IT block, append the condition code to an FPU instruction BEFORE appending the data type specifier, as in VADDLE.F32

Back

Do we need to use VMRS after VCMP?

Front

Yes. Comparing two FPU values requires VCMP followed by VMRS APSR_nzcv,FPSCR before the conditional branch or IT block.
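A floating-point compare has four possible outcomes rather than three, because a NaN is unordered with respect to everything. A minimal C sketch of that classification follows; cmp_result and compare_floats are hypothetical names, and the code models the outcomes of the compare, not the flag bits themselves.

```c
#include <math.h>

/* Hypothetical result type: the four outcomes of comparing two floats. */
typedef enum { CMP_LESS, CMP_EQUAL, CMP_GREATER, CMP_UNORDERED } cmp_result;

cmp_result compare_floats(float a, float b)
{
    if (isunordered(a, b))      /* at least one operand is NaN */
        return CMP_UNORDERED;
    if (a < b)
        return CMP_LESS;
    if (a == b)                 /* +0.0 and -0.0 compare equal */
        return CMP_EQUAL;
    return CMP_GREATER;
}
```

The unordered case is why the condition codes chosen after VMRS matter: when either operand is NaN, tests such as less than and equal are both false.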

Back

How do we preserve FPU regs?

Front

By copying the values in S0-S15 to R4-R8 with VMOV, and pushing/popping R4-R8

Back

Do VPUSH and VPOP work with int regs?

Front

No, only FPU regs.

Back

Does VMOV convert an integer to float and vice-versa?

Front

No. That requires a combination of VMOV and VCVT.

• VCVT.S32.F32 Sd,Sm
• // Sd <- (int32_t) Sm, where Sm is a float and the result is a 2’s comp integer
• VMOV Rn, Sd
Back

What are the addressing modes for VSTR?

Front

[Rn] and [Rn, constant]

Back

FPU instruction cycle counts

(1 card)

What are the cycle counts for FPU instructions?

Front
Back

Division by an Arbitrary Constant

(2 cards)

How can you do floating-point division using reciprocal multiplication?

Front
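A minimal C sketch of the idea, not the card's original answer: dividing by a constant c can be replaced by multiplying by 1/c, which is usually cheaper than a divide. The function names are hypothetical.

```c
/* Exact: 0.25 is a power of two, so x * 0.25f equals x / 4.0f
   (barring underflow for very small x). */
float divide_by_4(float x)
{
    return x * 0.25f;
}

/* Approximate: 1/3 is not exactly representable, so the product can
   differ from x / 3.0f in the last bit. */
float divide_by_3(float x)
{
    const float one_third = 1.0f / 3.0f;   /* reciprocal computed once */
    return x * one_third;
}
```

The tradeoff is accuracy: when the reciprocal has to be rounded, the result may be off by one unit in the last place compared with a true division.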
Back

How can you divide using integer multiplication?

Front
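A minimal C sketch of the idea, not the card's original answer: an unsigned divide by a constant can be done with one multiply and one shift, using a precomputed scaled reciprocal. The example divides by 10 with the constant 0xCCCCCCCD, which is 2^35 / 10 rounded up. The function name is hypothetical.

```c
#include <stdint.h>

/* x / 10 for any unsigned 32-bit x, using a 64-bit product.
   0xCCCCCCCD = ceil(2^35 / 10), so (x * 0xCCCCCCCD) >> 35 == x / 10. */
uint32_t divide_by_10(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}
```

Each divisor needs its own constant and shift amount, chosen so that the rounding error never reaches the integer part of the quotient.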
Back