Let's assume we want to build Machine One and Machine Two. They both output messages from an alphabet of A, B, C, or D. Machine One generates each symbol randomly, they all occur 25% of the time, while Machine Two generates symbols according to a different, uneven set of probabilities. Which machine produces more information? Claude Shannon cleverly rephrased the question: if you had to predict the next symbol from each machine, what is the minimum number of yes or no questions you would expect to ask?

Let's look at Machine One. One approach is to pose a question which divides the possibilities in half. For example, as our first question we could ask about any two symbols, such as "is it A or B?", since there is a 50% chance of A or B and a 50% chance of C or D. After getting the answer, we can eliminate half of the possibilities, and we will be left with two. So we simply pick one, such as "is it A?", and after this second question we will have correctly identified the symbol. We can say the uncertainty of Machine One is two questions per symbol.

What about Machine Two? As with Machine One, we could ask two questions to determine the next symbol. But this time the probability of each symbol is different, so we can ask our questions differently. Here A has a 50% chance of occurring, and all the other letters add up to 50%. We could start by asking "is it A?"; if it is A we are done, only one question in this case. Otherwise, we are left with two equal outcomes: D, or B and C. We could then ask "is it D?". Otherwise, we have to ask a third question to identify which of the two remaining symbols it is. On average, how many questions do you expect to ask to determine a symbol from Machine Two?

This can be explained with an analogy. Let's assume instead we generate symbols by bouncing a disc off a peg in one of two equally likely directions. Based on which way it falls, we can generate a symbol. To generate more symbols we add a second level, or a second bounce, so that we have two bounces, which lead to four equally likely outcomes. Based on where the disc lands, we output A, B, C, or D.
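To make the averages concrete, here is a minimal sketch of the calculation in Python. The exact probabilities for Machine Two are not listed above, so the values below (A = 0.5, D = 0.25, B = C = 0.125) are assumptions read off the questioning strategy just described, and the question counts follow that same strategy.

```python
# A minimal sketch of the "expected number of questions" calculation.
# Machine Two's probabilities (A=0.5, D=0.25, B=C=0.125) are assumed from
# the questioning strategy described above; they are not stated explicitly.
from math import log2

machine_one = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
machine_two = {"A": 0.5, "B": 0.125, "C": 0.125, "D": 0.25}

# Questions needed per symbol: two for every symbol of Machine One;
# for Machine Two: "is it A?" (1), then "is it D?" (2), then a third
# question to separate B from C (3).
questions_one = {"A": 2, "B": 2, "C": 2, "D": 2}
questions_two = {"A": 1, "B": 3, "C": 3, "D": 2}

def expected_questions(probs, questions):
    """Average number of yes/no questions per symbol."""
    return sum(probs[s] * questions[s] for s in probs)

def entropy_bits(probs):
    """Shannon entropy: H = -sum(p * log2(p))."""
    return -sum(p * log2(p) for p in probs.values() if p > 0)

print("Machine One:", expected_questions(machine_one, questions_one),
      "questions,", entropy_bits(machine_one), "bits")
print("Machine Two:", expected_questions(machine_two, questions_two),
      "questions,", entropy_bits(machine_two), "bits")
```

Under these assumptions the average works out to 2 questions per symbol for Machine One and 1.75 for Machine Two, and in each case the number of questions matches the Shannon entropy, -sum(p * log2(p)), of the machine's distribution.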
Can't get how machine 1 gives more information than machine 2 just because the data coming out of it is more uncertain. How is uncertainty related to information? What exactly do we mean by information here? To me, information is "what is conveyed or represented by a particular arrangement or sequence of things"; are we talking about that information here? Also, I would like to explain what "more information" means to me. For that, the first thing I have to choose is what I need information about. For example, say I need information about "Computers", and the two machines gave me information as below:
Machine 1: a computer is an electronic device.
Machine 2: a computer is an electronic device which can perform huge arithmetic operations in fractions of a second.
In the above I would say machine 2 is more informative, as it gives more detailed information about the subject I wanted to know about. Is the second sentence, produced by machine 2, more uncertain than machine 1's? Is this entropy being used in some real-world application? If yes, then quoting a real-world example in the video would have made it clearer. What is entropy? Why should I know it? Where will it help me, or where does it help the world, to know about entropy?

I have a question related to the application of Stan results to downstream information-theoretic metrics including entropy, conditional entropy, and information gain. Here's the background: I have a feature vector of response variables y that is, say, a measurement of an individual's stature, and my target or conditioning variable, say x, is the country that individual comes from. My ultimate goal is to learn the information gain, i.e., how much more certain we are of an individual's country of origin given their stature. To be precise, IG = H(x) - H(x|y), or the entropy of the target variable minus the conditional entropy of x given y. Given stature is continuous, the conditional entropy is H(x|y) = -\int\int f(x,y)\,\log f(x|y)\,dx\,dy. Math aside, does anyone have any suggestions on how the Stan log probability density can be used here, either by extracting the results or possibly embedded in generated quantities? Here I include a VERY general model that models stature as a normal distribution with a mean function and an sd function. Further, X here is not the target variable from above, it is a covariate, age. I assume I'd have to include the target somewhere. Also, my ultimate goal is to get IG based on an MVN model of more than one trait (i.e., stature and weight).
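One way the pieces could fit together downstream of Stan is sketched below: a Monte Carlo estimate of IG = H(x) - H(x|y) for a discrete country variable x and a continuous stature y, assuming the fitted model can be summarised as a normal distribution for stature within each country. Every name and number in it is a placeholder (hypothetical country labels, made-up means and sds, an assumed prior over countries), not output from any real fit.

```python
# Hedged sketch: Monte Carlo estimate of IG = H(x) - H(x|y) for a discrete
# target x (country) and a continuous feature y (stature). The per-country
# normal parameters are placeholders standing in for quantities you would
# derive from a fitted Stan model; the prior P(x) is assumed.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

countries = ["country_1", "country_2", "country_3"]  # hypothetical labels
p_x = np.array([0.4, 0.35, 0.25])                    # assumed prior P(x)
mu = np.array([165.0, 172.0, 178.0])                 # stature mean per country (cm)
sigma = np.array([6.0, 7.0, 6.5])                    # stature sd per country (cm)

def entropy(p):
    """Discrete entropy in nats: -sum(p * log(p)), ignoring zero entries."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

# H(x): uncertainty about the country before seeing stature.
h_x = entropy(p_x)

# H(x|y): average posterior uncertainty, estimated by Monte Carlo.
# Draw (x, y) from the joint model, then apply Bayes' rule for P(x|y).
n_draws = 50_000
x_draws = rng.choice(len(countries), size=n_draws, p=p_x)
y_draws = rng.normal(mu[x_draws], sigma[x_draws])

# Likelihood of each simulated stature under every country's distribution.
lik = norm.pdf(y_draws[:, None], loc=mu[None, :], scale=sigma[None, :])
post = lik * p_x                           # unnormalised P(x|y)
post /= post.sum(axis=1, keepdims=True)

# Posterior entries are strictly positive here, so the log is safe.
h_x_given_y = np.mean(-np.sum(post * np.log(post), axis=1))

print(f"H(x)               = {h_x:.3f} nats")
print(f"H(x|y)             = {h_x_given_y:.3f} nats")
print(f"IG = H(x) - H(x|y) = {h_x - h_x_given_y:.3f} nats")
```

If the per-observation log densities are wanted inside Stan itself, one option (a suggestion, not the original poster's code) would be to evaluate them in generated quantities, for example with normal_lpdf for each candidate country, or multi_normal_lpdf once stature and weight are modelled jointly, and then do the Bayes-rule normalisation and entropy averaging over the posterior draws afterwards in R or Python. Either way, the target variable would need to enter the model.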