
The Math Help Thread

Status
Not open for further replies.
12239536_10156351770855372_6142055492237930193_n.jpg


Anyone?
I've no idea how to even start with this.

Well, I can help with the first part :p

f(x + y) = f(x) + f(y) for all x and y

So, if you take x = y = 0

Then f(0 + 0) = f(0) + f(0)

f(0) = 2 f(0)
0 = 2 f(0) - f(0)
So, we have that f(0) = 0

I'm not very good with continuity.. Sorry

What I can see is that if you take y = -x, then f(x - x) = f(x) + f(-x)

So, 0 = f(0) = f(x) + f(-x)

This means that f(-x) = -f(x)
So, f(x) is an odd function

But that's all I can think of
 
Guys I have a very hard problem to ask.

We know we can't divide by zero, so what happens when this occurs?

(2) / (3/0) = (2)*(0/3) = 0

Is this allowed? We are dividing by (3/0)!
 
Well, I can help with the first part :p

f(x + y) = f(x) + f(y) for all x and y

So, if you take x = y = 0

Then f(0 + 0) = f(0) + f(0)

f(0) = 2 f(0)
0 = 2 f(0) - f(0)
So, we have that f(0) = 0

I'm not very good with continuity.. Sorry

What I can see is that if you take y = -x, then f(x - x) = f(x) + f(-x)

So, 0 = f(0) = f(x) + f(-x)

This means that f(-x) = -f(x)
So, f(x) is an odd function

But that's all I can think of


We can solve the second using limits.

We are given that,

lim_{x -> 0} f(x) = f(0).

Let a be a real number. Then,

lim_{x -> a} f(x) = lim_{x - a -> 0} f(x - a + a).

For convenience, let y = x - a. Because f is additive and f(0) = 0, we get that,

lim_{x -> a} f(x) = lim_{y -> 0} f(y + a) = lim_{y -> 0} f(y) + f(a) = f(a).

Hence, f is continuous at x = a.
 
Well, the statement

2/(3/0) = 2*(0/3) is the problem

Numbers can have something called a multiplicative inverse.

The multiplicative inverse of x (normally written x^-1) is a number y such that x*y = 1

This means y = 1/x

One important property of the inverse is that the inverse of the inverse is the original number

so x = (x^-1)^-1

This is pretty clear, as 1/(1/x) = x

(2) / (3/0) = (2)*(0/3) uses this property, can you see it?

It says that 2 / (3/0) = 2*(0/3)

It's saying that 1/(3/0) = 0/3, the property I described above

But that's false, because 0 has no multiplicative inverse: there's no number that gives 1 when multiplied by 0. So 3/0 is not a number at all, and you can't use that property on it.

Using limits, it's actually possible to prove that lim x->0 [2/(3/x)] = lim x->0 [2*(x/3)] = 0, but that's a different thing.
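
To make the difference between "the value at 0" and "the limit at 0" concrete, here is a small Python sketch. The function names `left` and `right` are made up for illustration; the only thing it relies on is that Python refuses to evaluate a division by zero.

```python
def left(x):
    # 2 / (3 / x): undefined at x = 0 because 3 / 0 is undefined
    return 2 / (3 / x)

def right(x):
    # 2 * (x / 3): perfectly fine at x = 0
    return 2 * (x / 3)

# For x close to (but not equal to) 0 the two expressions agree and shrink to 0.
for x in [1.0, 0.1, 1e-3, 1e-6]:
    print(x, left(x), right(x))

# At x = 0 only the right-hand expression has a value.
print(right(0.0))
try:
    left(0.0)
except ZeroDivisionError as exc:
    print("left(0) is undefined:", exc)
```

So the two sides agree everywhere except at x = 0, which is exactly the point where the "flip the fraction" step was being used.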
 
We can solve the second using limits.

We are given that,

lim_{x -> 0} f(x) = f(0).

Let a be a real number. Then,

lim_{x -> a} f(x) = lim_{x - a -> 0} f(x - a + a).

For convenience, let y = x - a. Because f is additive and f(0) = 0, we get that,

lim_{x -> a} f(x) = lim_{y -> 0} f(y + a) = lim_{y -> 0} f(y) + f(a) = f(a).

Hence, f is continuous at x = a.

Ahhhh. Thanks. That makes sense now.
Knew I was missing something simple in the question.
 
I've got this question about an integral. I've got a Gaussian integral and we've been given that:

k8kPHnc.png


The thing is, when I type this into Wolfram Alpha for k = 1/2, 3/2 etc., it comes up with 0. What's the right answer? The question is more broadly about the Gram-Schmidt orthonormalisation process, and I have no clue what's going on with this integral.
 
That looks like the gamma function. You should look it up.
 
k is supposed to be positive integers 1,2,3,....

I don't think you are supposed to put in fractions.
 
EDIT: Oops, upon reading again it doesn't look like you want a derivation. Anyway, for a positive half-integer k that integral is trivially zero by the oddness of the integrand. Note that the condition that 2k-1 be odd rules out positive half-integer values of k.

Looks like k is supposed to be an integer. Or at least it must be by how I derived the formula.

Here is the basic line of thinking:
we know that Integral[ exp(-a*x^2) dx, -inf, inf ] = sqrt(pi/a) for a > 0 and const w.r.t. x.

Note that: (d/da) [exp(-a*x^2)] = -x^2 * exp(-a*x^2),
where (d/da) represents the derivative with respect to a.

Using this relation, you can evaluate the integral of x^2*exp(-a*x^2) in terms of the original integral. You should be able to find a pattern.
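
For reference, here is how the first step of that pattern might look when written out. This is a sketch that assumes a > 0 and that differentiating under the integral sign is allowed (it is for this integrand); the general formula on the last line is the standard result for positive integer k and is presumably what the formula in the image expresses, though the image itself isn't reproduced here.

```latex
\int_{-\infty}^{\infty} x^{2} e^{-a x^{2}}\,dx
  = -\frac{d}{da}\int_{-\infty}^{\infty} e^{-a x^{2}}\,dx
  = -\frac{d}{da}\sqrt{\frac{\pi}{a}}
  = \frac{1}{2a}\sqrt{\frac{\pi}{a}},
\qquad
\int_{-\infty}^{\infty} x^{2k} e^{-a x^{2}}\,dx
  = \left(-\frac{d}{da}\right)^{k}\sqrt{\frac{\pi}{a}}
  = \frac{(2k-1)!!}{(2a)^{k}}\sqrt{\frac{\pi}{a}} .
```

Each extra derivative in a brings down another factor of -x^2, which is where the double factorial comes from.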


Yes you can, it's the gamma function. There are no discontinuities there.
Gamma function has no Gaussian in it, but goes exp(-x)
 
My professor said something to me that was kind of strange. He asked whether we thought he should make attendance a part of his grading next semester to encourage students to come. I said only if he made the attendance grade a percentage of the overall grade, and not just some penalty that can only hurt you. He said that would be the exact same thing. However, isn't it true that making the attendance grade a part of your grade would dampen the other scores?

Let's say that attendance was worth 10% of your grade and the remaining 90% came from 3 tests. Let's say you get a 90 on all tests and 0 on the attendance part. Then your grade would be 90*.9 + 0*.1 = 81.

If instead the 3 test scores were worth 100% of your grade and you could lose up to 10 points on your grade average due to missing classes, then getting a 90 on all tests would give you an average of 90. And if you missed enough classes to lose that same 10% penalty to your grade, you'd get an 80.

80 != 81

He was pretty insistent that they would be the same.
 
Gamma function has no Gaussian in it, but goes exp(-x)

Oh I was thinking the following, make the integral from 0 to infinity instead and multiply it by two. Then make a change of variable x^2 = u. And it does give a form of the gamma function. k = 1/2 seems equivalent to Gamma(1) for example. Maybe my logic is flawed.
 
He probably didn't mean that everything will always be numerically the same, he means there wouldn't be substantively different outcomes either way. You pushed back on what he wanted because you thought it was unfair, and your contrived example to illustrate the injustice was a 1% difference in overall grade. :p

The most extreme examples are 100 grade, 0 attendance and 0 grade, 100 attendance:
100 grade, 0 attendance: Your scenario: 90. His scenario: 90
0 grade, 100 attendance: Your scenario: 10. His scenario: 0
Hard to imagine he'd find those to be substantively different.

If you really want to illustrate an example that would be substantively different, a student with a 49 and perfect attendance passes in your scenario and fails in the professor's scenario. A student with a 55 and no attendance fails in both, but a student with a 55 and poor attendance passes in yours and not his. In effect, your scenario only benefits students when their attendance grade is higher than their earned grade (this should be obvious if you think about how a weighted mean works).

Of course, the professor might argue that no one should pass just because of attendance and as long as the proportion of the grade that comes from grades is high and the proportion of attendance low, the magnitude of the difference is quite limited.
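
If it helps to see the two schemes side by side, here is a rough Python sketch. The function names are made up, and it assumes the penalty scheme subtracts points in proportion to missed attendance (0% attendance loses the full 10 points), which is one reading of the professor's version rather than anything he stated.

```python
def weighted_grade(test_avg, attendance, attendance_weight=0.10):
    """Attendance counts as a percentage of the grade (the student's proposal)."""
    return (1 - attendance_weight) * test_avg + attendance_weight * attendance

def penalty_grade(test_avg, attendance, max_penalty=10.0):
    """Tests are 100% of the grade; missed classes subtract up to max_penalty points.

    Assumption: the penalty is linear in the share of classes missed.
    """
    return test_avg - max_penalty * (1 - attendance / 100)

# (test average, attendance score) pairs taken from the examples in the thread
cases = [(90, 0), (100, 0), (0, 100), (49, 100), (55, 0)]
for test_avg, attendance in cases:
    w = weighted_grade(test_avg, attendance)
    p = penalty_grade(test_avg, attendance)
    print(f"tests={test_avg:3d} attendance={attendance:3d} weighted={w:5.1f} penalty={p:5.1f}")
```

Under these assumptions the gap between the two schemes works out algebraically to 0.1 * (100 - test average), whatever the attendance is. So they coincide only for a perfect test average, and the weighted version is never lower than the penalty version.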
 
Oh I was thinking the following, make the integral from 0 to infinity instead and multiply it by two. Then make a change of variable x^2 = u. And it does give a form of the gamma function. k = 1/2 seems equivalent to Gamma(1) for example. Maybe my logic is flawed.

You are right. The integral is equivalent to Gamma(k+1/2) with that substitution. But I don't think it's possible to get an exact answer unless k is a positive integer.
I also am not readily familiar with all the properties of the gamma function though.

Edit: since he mentions Gram-Schmidt, I assume that integral will show up by using Hermite polynomials since they are orthonormal w.r.t. to a Gaussian weight function, which also makes me think that k is a positive integer
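
The substitution can be sanity-checked numerically. This is only a sketch: it takes the integrand to be x^(2k) * exp(-x^2) (i.e. a = 1), integrates over the half-line and doubles it as described above, and compares with Python's `math.gamma`. The helper name, cutoff and step count are arbitrary choices.

```python
import math

def doubled_half_line_integral(k, upper=12.0, steps=24_000):
    """Approximate 2 * integral from 0 to `upper` of x^(2k) * exp(-x^2) dx.

    Uses the composite trapezoid rule; the tail beyond `upper` is negligible.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x ** (2 * k) * math.exp(-x * x)
    return 2.0 * total * h

for k in [0.5, 1, 1.5, 2]:
    print(k, doubled_half_line_integral(k), math.gamma(k + 0.5))
```

Note that this is the doubled half-line integral. Over the whole real line the odd powers (half-integer k) still integrate to zero, which is consistent with what Wolfram Alpha reported.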
 
You are right. The integral is equivalent to Gamma(k+1/2) with that substitution. But I don't think it's possible to get an exact answer unless k is a positive integer.
I also am not readily familiar with all the properties of the gamma function though.

But I thought all spelltropy wanted was for positive half integer k. If it were negative, the half integer k would mean discontinuity though.
 
I'm not sure I'm following. I am not saying that k can be negative.

Sorry, I didn't make myself clear. All I wanted to say was that I thought only positive is what the poster was looking for. But your edit above clarified things and you think the same.
 
Sorry, I didn't make myself clear. All I wanted to say was that I thought only positive is what the poster was looking for. But your edit above clarified things and you think the same.

It's my fault. I'm just so used to being asked for derivations that I jumped the gun here.
 
Yep, I noticed the fact that for x, x^3, x^5 etc the function under the integral is odd, so would be 0. So is it just that this hint is missing that the solution given only holds for integer k? I did the question assuming so (and taking every integral with an odd power of x in it as equalling zero), and the answer (the orthonormal basis) came out quite nicely, so I'm gonna assume it's correct.
 
Yes you can, it's the gamma function. There are no discontinuities there.

The gamma function generalizes the factorials, but a factorial is not the same as the gamma function.

As a rule, if it is written as a factorial, it only applies to integers. The RHS only has (double) factorials
 
He was arguing that it is literally the same. I can see how my example of 1 point isn't a huge difference though. My guess is he did the 100/0 example in his head and got 90 for both and just jumped to a conclusion without thinking it through.
 
Yep, I noticed the fact that for x, x^3, x^5 etc the function under the integral is odd, so would be 0. So is it just that this hint is missing that the solution given only holds for integer k? I did the question assuming so (and taking every integral with an odd power of x in it as equalling zero), and the answer (the orthonormal basis) came out quite nicely, so I'm gonna assume its correct.

Well, the hint says that 2k-1 = n is > 0 and odd, which restricts k to be a positive integer.
 
The gamma function generalizes the factorials, but a factorial is not the same as the gamma function.

As a rule, if it is written as a factorial, it only applies to integers. The RHS only has (double) factorials

I thought the poster wanted to generalize it, basically.
 
Algebra related: given a field F and the field of rational functions over F (denoted by F(x)), is deg(F(x)/F) always infinite?

I think not for special cases of fields, like those with a characteristic of 2, but can't quite show it.

Edit: I think I answered it myself.

What is true is that the characteristic of F(x) is finite (in fact p) if char(F) = p, however F(x) is infinite despite having a non-zero characteristic.

x is by definition transcendental over F so deg(F(x)/F) should be infinite (finite extensions are only possible by adjoining algebraic elements).
 
Bit of probability work here:
I have a Normal Random Variable, with distribution N(-1/sqrt(2), 1). We have to find the law of this. Does anyone have any idea? Been looking through class notes all week and just no idea how to approach this - they barely explain what the law is; by all accounts it's just P(a<x<b) for an input [a,b] but no indication of how to calculate it at all. Supposedly using limit theorems in some way, but no clue which to use. Very confused, would appreciate any help available!
 
Integral from a to b of f(t) = F(b) - F(a)

Then do we have d/dt (the aforementioned integral) = F'(b) - F'(a) = f(b) - f(a)?

Then, is it also true if we have integral from a to b of d/dt f(t) = f(b) - f(a)?

Does the order in which we integrate and differentiate the same function matter?
 
Integral from a to b of f(t) = F(b) - F(a)
Assuming that F'(x) = f(x) (and that f is integrable on the given domain), this is true by the fundamental theorem of calculus.
Then do we have d/dt (the aforementioned integral) = F'(b) - F'(a) = f(b) - f(a)?
This is not true. The aforementioned definite integral evaluates to a constant F(b) - F(a), so its derivative is zero. You may be thinking of the other half of the fundamental theorem of calculus, which says that if F(x) = the integral from a to x of f(t) dt, then F'(x) = f(x).
Then, is it also true if we have integral from a to b of d/dt f(t) = f(b) - f(a)?
This is true, since it is essentially a relabeling of the first integral.
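
Written out, the three statements look like this (a sketch assuming f is continuous with antiderivative F, and for the last one that f' is continuous on [a, b]). The example function f(t) = t^2 on [0, 1] is just an illustration.

```latex
\frac{d}{dx}\int_a^x f(t)\,dt = f(x),
\qquad
\frac{d}{dx}\left[\int_a^b f(t)\,dt\right] = \frac{d}{dx}\bigl[F(b)-F(a)\bigr] = 0,
\qquad
\int_a^b f'(t)\,dt = f(b)-f(a).

% Example with f(t) = t^2 on [0, 1]:
\int_0^1 t^2\,dt = \tfrac{1}{3} \;\text{(a constant, so its derivative is } 0\text{)},
\qquad
\int_0^1 2t\,dt = 1 = f(1) - f(0).
```

So the order does matter: differentiating a definite integral with fixed limits gives 0, while integrating a derivative gives the change in the function.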
 
i'm doing a question on proving the uniform convergence of the zeta function defined by the sum of n^(-x) from 1 to infinity. I proved the uniform convergence on [1+d, infinity) for d >0 by applying the Weierstrass M-test comparing 1/n^x to 1/n^(1+d) which is clearly convergent.

Trying to prove/disprove uniform convergence on the interval (1,infinity) now - any ideas on how to do this?
 
I need serious help with this. I want to find the inverse Laplace of F(s)=3/(s^2+4)^2. Not with the use of convolution though. Any help is appreciated, I've wasted a lot of time on this...
 
I have an IGCSE-level exam tomorrow; it is on matrices, functions and various graphs. My highest math grade this term was ninety percent, an A plus; my lowest was sixty-five percent, a C. Wish me luck!!!
 
Guys. I am lost.
This is a task for an analysis/calculus home assignment. I did most of the tasks except for this one. I have no clue what they mean. Well. If it were linear algebra, we would nearly have a linear function here. But that's all I am getting out of it.

I have an IGCSE-level exam tomorrow; it is on matrices, functions and various graphs. My highest math grade this term was ninety percent, an A plus; my lowest was sixty-five percent, a C. Wish me luck!!!
Good luck!
 
If you want to picture what a function will look like, work out its derivative. Treat t as a scalar and calculate f'(x).
 
What's a linear map in a one dimensional space?
 
Take f(x) = ax + b, then f(tx) = atx + b. Suppose b = 0, then f(tx) = atx = tax = tf(x). Hence, if f is a linear function, it satisfies the given condition if and only if the constant term equals 0. To complete the proof, show that f is a linear function with a constant term of 0 strictly from the property f(tx) = tf(x).
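
One possible way to finish, sketched under the assumption that the task gives a function f from the reals to the reals with f(tx) = t f(x) for all real t and x (the task statement itself isn't reproduced in the thread, so treat this as a guess at the setup):

```latex
f(x) = f(x \cdot 1) = x\,f(1) \quad \text{for every real } x,
\qquad\text{so}\qquad
f(x) = a x \ \text{ with } a := f(1), \ \text{ and in particular } f(0) = 0 .
```

Here the roles are swapped: x plays the part of the scalar t and 1 plays the part of the argument.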
 
B2zhUfv.png


For this problem, I was able to find the power series representation easily, but for some reason I'm not able to find the interval of convergence. When I try to use the ratio test method, I end up with an x^2 term which doesn't make sense to me considering the answer choices. I have no clue what I'm doing wrong and being stumped on this problem is weird because power series and intervals of convergence have been pretty easy for me so far.
 
You do have an x^2 term but you also end up with a 5^2 term underneath it.
 
My little sister sent me a math problem and I can't for the life of me remember how to solve it. I'm trying to rack my brain around this.

Problem:
It's a right angle trapezoid with 2 sides that are 4cm and 5 cm long. The bottom base is 6cm long.
-------
I.........\
4.........\ 5
I----------\
6
Calculate the perimeter.

This damn problem! 3 sides and two right angles and I'm stuck. I'm pretty sure you have to calculate 2 diagonal triangles to get at the answer but my brain apparently deleted that info a long time ago.
 
Code:
   _____
  |     |\
4 |    4| \ 5
  |     |  \
  |_____|___\
     6

Note the triangle on the right is a right triangle with hypotenuse 5 and side 4. That means the other side is 3 (since 3^2 + 4^2 = 5^2). That means the top edge is also 3 (i.e. 6 - 3).

Therefore the perimeter is 4 + 3 + 5 + 6 = 18 cm
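
The same calculation as a quick Python check (variable names are just for illustration, lengths in cm as given in the problem):

```python
import math

height = 4    # the vertical side, which is also the height of the trapezoid
slanted = 5   # the slanted side (hypotenuse of the right triangle)
bottom = 6    # the bottom base

# Horizontal leg of the right triangle, from Pythagoras
overhang = math.sqrt(slanted**2 - height**2)

# The top base is the bottom base minus that leg
top = bottom - overhang

perimeter = height + top + slanted + bottom
print(overhang, top, perimeter)
```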
 
12308457_10156383421915372_3174704751246368094_n.jpg



Need to set up a proof to show this.
My mind just turns to smudge when I have to attempt these type of questions.

Any help appreciated!

Pretend the > was an = sign, and you were trying to find a relationship between a and b that made the left and right equal. In this case it's trivial, because the top and bottom are the same with a different letter. But usually the inequalities are more complex.

So anyway, the idea is to pretend they're equal and manipulate the expression algebraically until you reach something that follows immediately from your given information. (b > a > 0). THEN, work backwards to reconstruct the original problem.

I'll give you an example, then you try to solve the original problem.

If a and b are real numbers with b > a > 0, then prove:

Code:
(a+b)^2
---------- < a+b
(a+b+1)

So start by getting rid of the fraction and then just simplify.

Code:
(a+b)^2 < (a+b)(a+b+1)
a^2+2ab+b^2 < a^2+2ab+b^2+a+b
0 < a+b

The last part is obvious since a and b are both greater than 0, a+b must be greater than 0. So now, to write the proof, you do it in reverse.

Code:
0 < a+b                   [Since b > a > 0]
(a+b)^2 < (a+b)^2+(a+b)   [Adding (a+b)^2 to both sides]
(a+b)^2 < (a+b)(a+b+1)    [Factorizing RHS]
  (a+b)^2
----------  < a+b         [Dividing both sides by a+b+1]
   a+b+1

When using this technique, be careful when you "divide both sides by X" or "multiply both sides by X" because if X is negative, then the inequality changes. In this case we don't have to worry about that because a and b are both positive, so a+b is positive, but in general keep that in mind.
 
get rid of the fractions by multiplying:
ab + b > ab + a
subtract ab from both sides

b >a

(this is given)

edit: same strategy cpp_is_king used
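
Written forwards as a proof, this might look as follows. The original inequality is in the image, so this assumes it is (a+1)/(b+1) > a/b with b > a > 0, which is what the cross-multiplied line ab + b > ab + a suggests.

```latex
b > a
\;\Longrightarrow\; ab + b > ab + a
\;\Longrightarrow\; b(a+1) > a(b+1)
\;\Longrightarrow\; \frac{a+1}{b+1} > \frac{a}{b}
\qquad \text{(dividing both sides by } b(b+1) > 0\text{)} .
```

The division in the last step keeps the direction of the inequality because b and b+1 are both positive.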
 
So anyway, the idea is to pretend they're equal and manipulate the expression algebraically until you reach something that follows immediately from your given information. (b > a > 0). THEN, work backwards to reconstruct the original problem.

Just to add to this comment, at first this approach may seem totally wrong-headed -- you're assuming what you want to prove, and then proving the initial assumption. To make sure it's legal to work like this, we need to ensure that each step in the chain of reasoning is a true material equivalence, or more loosely stated, is "reversible". Operations such as adding or subtracting the same quantity from both sides of an inequality, for instance, have this property. e.g. if a < b, then it's true that a + c < b + c, and vice versa.

Once we've worked out how to get from the statement we want to prove to the initial assumption, if we were careful to keep each step reversible, we can turn the argument backwards and get a logically valid derivation of the statement we want to prove from the initial assumption.

To understand why reversibility is important, let's look at what happens when we don't have it. An example of a step that fails to be reversible (in some circumstances) is squaring both sides of an equation. Consider the following "proof" that 2 = 1:
Code:
2 = 1
4 = 2  (multiplying both sides by 2)
1 = -1  (subtracting 3 from both sides)
1 = 1  (squaring both sides)
Although we worked backward from the desired conclusion to a valid statement, this chain of reasoning cannot be reversed, because the last step isn't reversible. Thus, fortunately for mathematics, this would not be a valid proof that 2 = 1.

(In terms of logic, we've established that the false statement 2=1 implies the true statement 1=1 -- which is a true but vacuous implication -- but we have not established the converse.)

The next question would be why we would need to do things in this backwards fashion. The answer is that we don't have to, but in practice it seems to be easier for us to reduce a complex expression down to a simple one, rather than the other way around. Since the statement to be proved is complicated and the initial assumption is simple, this problem is a good candidate for working backwards.
 
Hi guys,

Hoping someone can help me out with some problems I have.

The problems are as follows.

Problem Description: I am given three data points (0,5), (2,7), and (4,7). They are generated by a function that is a polynomial with an order of 1 or 3. There is noise in the system so that when I feed an x value, the y value is generated higher or lower than the true value inversely proportional to the mean-squared error of that y value


1) I have to calculate the best-fit line for the data and the total probability P(Data|Line) of observing the data.

I did the following

Slope Formula
(7-5)/(2-0) = 2/2 = 1

y-y1=m(x-x1)
y-5=1(x-0)
y=1x+5

Using the formula I calculated some additional data points((1,6), (3,8), (5,10)) and drew a line using the original data points(ignoring (4,7) since it doesn't fit in well with the other points) and the additional data points I calculated. Is there anything that I would have to do?

How would I go about calculating the probability P(Data|Line)?

2) For this I have to calculate the best-fit third order polynomial and the probability P(Data|Curve).

I am not sure how to start this one. All I know is the equation (y = ax^3 + bx^2 + cx + d) but I'm not quite sure how to start the calculation. How would I get the coefficient values? How would I go about calculating the probability P(Data|Curve)?

Thanks in advance.
 
i'm trying to prove a few facts about the Gram matrix if anyone could help.

Firstly, i'm trying to show that if the vectors v1 through vk are linearly dependent, then the determinant of the Gram matrix is 0. For this, I've supposed that (without loss of generality) vk is a linear combination of v1 through v(k-1), then the final row (& column?) of the Gram matrix will be a linear combination of the first k-1 rows so the determinant will be 0. I'm having a little trouble proving this is true - it seems obvious but i feel i'm missing some justification.

Secondly, i'm trying to show for linearly independent v1,..,vk the determinant is > 0. I have no idea for this one if anyone has any suggestions :\
 
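To prove the first part, first you can show that if the column vectors v1, ..., vk are linearly dependent with coefficients a1, ..., ak (not all zero), so that a1 * v1 + a2 * v2 + ... + ak * vk = 0, then the rows g1, ..., gk of the Gram matrix are linearly dependent with the same coefficients (i.e. a1 * g1 + a2 * g2 + ... + ak * gk = 0). Your proof should also easily allow you to establish the converse, so the determinant is nonzero when the vectors are linearly independent. For the sign, note that the Gram matrix is positive semidefinite (c' G c is the squared norm of c1 * v1 + ... + ck * vk), so its eigenvalues are nonnegative and a nonzero determinant must be positive.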
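
In symbols, a sketch of the two computations, assuming a real inner product and writing G for the Gram matrix with entries G_ij = <v_i, v_j>:

```latex
\Bigl(\sum_{i=1}^{k} a_i g_i\Bigr)_j
  = \sum_{i=1}^{k} a_i \langle v_i, v_j \rangle
  = \Bigl\langle \sum_{i=1}^{k} a_i v_i,\; v_j \Bigr\rangle
  = \langle 0, v_j \rangle = 0
  \quad \text{for each } j,
\qquad
c^{\mathsf T} G\, c
  = \sum_{i,j} c_i c_j \langle v_i, v_j \rangle
  = \Bigl\| \sum_{i=1}^{k} c_i v_i \Bigr\|^{2} \;\ge\; 0 .
```

The first identity gives the dependent rows. The second shows the quadratic form vanishes only when the combination of the v_i vanishes, which for independent vectors means c = 0.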
 
Hi guys,
Problem Description: I am given three data points (0,5), (2,7), and (4,7). They are generated by a function that is a polynomial with an order of 1 or 3. There is noise in the system so that when I feed an x value, the y value is generated higher or lower than the true value inversely proportional to the mean-squared error of that y value
You didn't provide the error distribution of the y values, so there isn't enough information to calculate the likelihood of the data. I think the bolded part is your attempt at doing this, but it is stated in an incomplete and/or confused manner. A typical "textbook" setup for a linear fit might be that y = a * x + b + epsilon, where a and b are unknown parameters to be estimated, and epsilon is a 0-mean Gaussian random variable with some known fixed standard deviation sigma, independent of x and drawn iid for each data point. Ordinary least squares (OLS) is a common technique for estimating the parameters. In this case it would correspond to finding the values of a and b that minimize the sum of squared residuals (the sum of the squared "epsilons"). This also applies mutatis mutandis for estimating a cubic fit.

1) I have to calculate the best-fit line for the data and the total probability P(Data|Line) of observing the data.

I did the following

Slope Formula
(7-5)/(2-0) = 2/2 = 1

y-y1=m(x-x1)
y-5=1(x-0)
y=1x+5
OK, it looks like you drew the line that goes straight through the first two data points. Unfortunately, this isn't the best fit line. For x=0,2,4, your fit predicts the y values 5,7,9, whereas the observed y-values are 5,7,7. Thus the sum of the squared errors is

(5-5)^2 + (7-7)^2 + (9-7)^2 = 4

To see that this fit is not optimal, I'll give you a better one (actually the best one, the OLS fit). Let's take the fit y = x/2 + 16/3. For x=0,2,4, my fit predicts the y-values 5.333, 6.333, 7.333. The sum of the squared errors is

(5.333 - 5)^2 + (6.333 - 7)^2 + (7.333-7)^2 = 2/3

This is a graphical comparison of the fits:

f5F6Woz.png


The red line does a better job of being close to all of the points, which is the intuition that OLS is trying to capture.

I'd suggest reviewing your class notes and/or textbook, hopefully this tells you what concepts you should be looking for.
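
For anyone who wants to reproduce the numbers above, here is a short Python sketch using the textbook closed-form formulas for a straight-line least-squares fit (slope = S_xy / S_xx, intercept = mean(y) - slope * mean(x)). It uses exact fractions so the 16/3 and 2/3 show up without rounding; the helper name `sse` is just for illustration.

```python
from fractions import Fraction

points = [(0, 5), (2, 7), (4, 7)]

def sse(slope, intercept):
    """Sum of squared errors of the line y = slope * x + intercept on the data."""
    return sum((slope * x + intercept - y) ** 2 for x, y in points)

n = len(points)
mean_x = Fraction(sum(x for x, _ in points), n)
mean_y = Fraction(sum(y for _, y in points), n)

# Closed-form ordinary least squares for a single predictor
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
s_xx = sum((x - mean_x) ** 2 for x, _ in points)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

print("OLS fit: slope", slope, "intercept", intercept)
print("SSE of y = x + 5:", sse(1, 5))
print("SSE of OLS fit:", sse(slope, intercept))
```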
 
Thank you boviscopophobic.

With what you explained and some reading I found through google I think I more or less get that question. Any chance you could explain the second question(Best-fit third order polynomial)?
 
It's hard to know how to give you a good answer, because I don't know what background you're supposed to have, or even if you're in a context where least squares fitting is appropriate. This is why I suggested reviewing class materials if possible. Nevertheless, I'll briefly sketch out two viewpoints, the "calculus" viewpoint and the "linear regression" viewpoint. Then I'll point out why this is a bad question.

Let's go back to fitting a linear model, y_i = a*x_i + b. By plugging in the observed data points (x_1, y_1), (x_2, y_2), ..., (x_k, y_k), you can get an expression for the SSE (sum of squared errors). This expression is a function of the two unknown parameters a and b. You can minimize the SSE using the standard techniques of multivariable calculus (find the zeros of the gradient, check the Hessian, etc.) To fit a cubic model, you do the exact same thing, except that now your model has 4 unknown parameters, which are the coefficients of the cubic polynomial. You then have to perform a 4-dimensional optimization, which I don't recommend actually doing. The calculus viewpoint, although elementary, is not very practical for actual computation.

In the linear regression view, we have a column vector of n dependent variables

Y = [y_1, y_2, .., y_n]'

and n corresponding predictor variables which are p-dimensional row vectors

x_i = [x_i1, x_i2, ..., x_ip] for i=1, 2, ..., n

Letting X be the matrix with the rows x_i, we suppose we have a model of the form

Y = X * B + E,

where B is a p-dimensional column vector of unknown regression coefficients (to be estimated) and E is the n-dimensional vector of iid 0-mean Gaussian errors. In the linear fit case, the predictor vectors look like [x 1], where x is the x-value of each observed data point. (The reason the 1 entry is in there is to allow for a constant term (i.e. intercept) in the model.) In the cubic fit case, the predictor variables are four-dimensional and look like [x^3 x^2 x 1].

The nice thing about this view is that the OLS solution is very easily obtained through matrix arithmetic,

Bhat = inv(X' * X) * X' * Y.

Note: The above is not intended to be a complete exposition. The main idea is for you to see if any of the terms look familiar and guide your studies appropriately.
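
As a rough illustration of the matrix formula, here is a pure-Python sketch for the linear fit on the three data points from the question. Instead of forming inv(X' * X) explicitly it solves the equivalent system (X' * X) * B = X' * Y by elimination. The helper functions are hypothetical, written only for this example; in practice one would normally reach for a linear algebra library.

```python
from fractions import Fraction

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def matmul(left, right):
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*right)]
            for row in left]

def solve(square, rhs):
    """Solve square * b = rhs by Gauss-Jordan elimination (assumes invertibility)."""
    n = len(square)
    aug = [list(square[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        # Raises StopIteration if no nonzero pivot exists, i.e. the matrix is singular.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [value / pivot_value for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

xs, ys = [0, 2, 4], [5, 7, 7]
X = [[Fraction(x), Fraction(1)] for x in xs]   # predictor rows of the form [x 1]
Y = [[Fraction(y)] for y in ys]

Xt = transpose(X)
XtX = matmul(Xt, X)
XtY = matmul(Xt, Y)
b_hat = solve(XtX, [row[0] for row in XtY])    # [slope, intercept]
print(b_hat)
```

For these points this gives slope 1/2 and intercept 16/3, matching the fit quoted earlier in the thread.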


As for why this is a bad question: assuming you've reported all the relevant details of the problem, there are only three data points. As we all know from elementary geometry, two points determine a line, and as we may know from later studies, an nth degree polynomial curve requires n+1 points to be uniquely determined. Since a general cubic has 4 degrees of freedom, this means we can find infinitely many cubics that pass exactly through the three observed data points. In all those infinitely many cases, the SSE is 0. So there is no unique best fit cubic unless we impose additional assumptions.
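
To see the non-uniqueness concretely, here is a small Python sketch. It takes the one quadratic through the three data points, q(x) = -x^2/4 + 3x/2 + 5, and adds any multiple c of x(x-2)(x-4), which vanishes at all three x-values. The parameter name c and the sample values are arbitrary.

```python
from fractions import Fraction

points = [(0, 5), (2, 7), (4, 7)]

def cubic(c, x):
    """A cubic through all three data points, for any choice of c."""
    quadratic = Fraction(-1, 4) * x**2 + Fraction(3, 2) * x + 5
    return quadratic + c * x * (x - 2) * (x - 4)

def sse(c):
    return sum((cubic(c, x) - y) ** 2 for x, y in points)

for c in [0, 1, -3, Fraction(1, 2)]:
    print("c =", c, "SSE =", sse(c), "value at x = 1:", cubic(c, 1))
```

Every choice of c gives an SSE of 0 on the data, yet the curves disagree away from the data points (for example at x = 1), so "best fit" doesn't single out one cubic.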
 
Thank you once again boviscopophobic..

I think I more or less got it thanks to the way you explained both of my questions. My professor unfortunately just removed those question since apparently most of my class did not understand how to do them. I really do appreciate your explanations and that you took the time to explain it. Thank you.
 