0

I have a MySQL dataset that looks like this:

a     b
.32   .72
.41   .80
.28   .64
.31   .80

And I want to assign values (c) to each row based on the conditions of a and b:

.3 < a < .4 and .71 < b < .83 = 1

.4 < a < .5 and .71 < b < .83 = 2

.2 < a < .3 and .58 < b < .77 = 3

and so on. This would result in my table looking like this:

a     b     c
.32   .72   1
.41   .80   2
.28   .64   3
.31   .80   1

How would I do this? I have tried a case when() statement but that didn't work since I don't know how/if it is possible to have one of those with more than one case.

japem
  • 1,037
  • 5
  • 16
  • 30
  • Use an IF statement if WHEN is not working for you: http://stackoverflow.com/questions/9648481/multiple-if-statements-on-mysql – Aziz Saleh Sep 19 '14 at 22:16
  • I can't figure out how do you get this numbers in conditions based on a and b columns data? – Miki Sep 19 '14 at 22:17
  • @Miki I don't understand what you're asking – japem Sep 19 '14 at 22:23
  • .3 < a < .4 and .71 < b < .83 = 1 -> based on what you use .71 and .83 in conditions? – Miki Sep 19 '14 at 22:25
  • Oh, those are my own numbers. I could've chosen anything. – japem Sep 19 '14 at 22:45
  • Are you kidding us? You want to know how to populate column based on existing columns but not using it yet using something else which is not explicitly specified? Can you please specify question with exact requirements you have? – Miki Sep 19 '14 at 22:54
  • Those are the exact requirements... Other people seemed to understand the question, I don't see what is so hard to get about this. – japem Sep 20 '14 at 22:54

3 Answers3

2

I don't fully understand this: I have tried a case when() statement but that didn't work since I don't know how/if it is possible to have one of those with more than one case.

This will get you the output you described however. I'm a bit lazy/tired so I'll just paste the output from SQL Fiddle:

SQL Fiddle

MySQL 5.5.32 Schema Setup:

CREATE TABLE Table1
    (`a` decimal(2,2), `b` decimal(2,2))
;

INSERT INTO Table1
    (`a`, `b`)
VALUES
    (.32, .72),
    (.41, .80),
    (.28, .64),
    (.31, .80)
;

alter table table1 add column c int;

update table1
set c = 
case 
  when (.3 < a and a < .4) and (.71 < b and b < .83) then  1
  when (.4 < a and a < .5) and (.71 < b and b < .83) then  2
  when (.2 < a and a < .3) and (.58 < b and b < .77) then  3
end;

Query 1:

select * from table1

Results:

|    A |    B | C |
|------|------|---|
| 0.32 | 0.72 | 1 |
| 0.41 |  0.8 | 2 |
| 0.28 | 0.64 | 3 |
| 0.31 |  0.8 | 1 |
jpw
  • 44,361
  • 6
  • 66
  • 86
1

You don't say what SQL implementation you're using.

If you're using SQL Server, you can add a computed column like this:

create table foo
(
  int a not null ,
  int b not null ,

  c as case
       when a > 0.3 and a < 0.4 and b > 0.71 and b < 0.83 then 1
       when a > 0.4 and a < 0.5 and b > 0.71 and b < 0.83 then 2
       when a > 0.2 and a < 0.3 and b > 0.58 and b < 0.77 then 3
       else null // anything that doesn't match one of the above tests
       end ,

)

If you can't do something like that, you could create a view:

create table foo
(
  int a not null ,
  int b not null ,
)
create view foo_view as
select foo.* ,
       case
       when a > 0.3 and a < 0.4 and b > 0.71 and b < 0.83 then 1
       when a > 0.4 and a < 0.5 and b > 0.71 and b < 0.83 then 2
       when a > 0.2 and a < 0.3 and b > 0.58 and b < 0.77 then 3
       else null // anything that doesn't match one of the above tests
       end as c

That ensures your computed c is always in synch with the underlying data.

If you go the update route, that version isn't much harder:

update foo 
set c = case
        when a > 0.3 and a < 0.4 and b > 0.71 and b < 0.83 then 1
        when a > 0.4 and a < 0.5 and b > 0.71 and b < 0.83 then 2
        when a > 0.2 and a < 0.3 and b > 0.58 and b < 0.77 then 3
        else null // anything that doesn't match one of the above tests
        end

Though you might consider putting an update trigger on the table to automagickally update c then a or b changes value.

Nicholas Carey
  • 71,308
  • 16
  • 93
  • 135
  • The OP states it's Mysql at the top of the question. Good points though, I just treated it as an one off operation. – jpw Sep 20 '14 at 00:04
1

I would create a secondary table and use it to drive the update.

Generally, try to parametrize things like this into tables, rather than making a big honking sql.

That still leaves you with the question of whether there is a contradiction in your criteria (i.e. multiple outcomes possible, no outcomes, etc...) but is cleaner in terms of sql and in terms of maintainability.

I've handled no outcome/multiple outcomes with exists and limit 1 below, respectively so that at least they don't error out.

drop table tgt;
create table tgt(a float, b float, calc int);

drop table range;
create table range(a_low float, a_high float, 
b_low float, b_high float, calc float);

select * from tgt;

insert into tgt (a,b, calc) values(.32,   .72, 0);
insert into tgt values(.41,   .80, 0);
insert into tgt values(.28,  .64, 0);
insert into tgt values(.31,   .80, 0);

/*
3 < a < .4 and .71 < b < .83 = 1

.4 < a < .5 and .71 < b < .83 = 2

.2 < a < .3 and .58 < b < .77 = 3
*/

insert into range (a_low, a_high, b_low, b_high, calc) 
values (.3, .4, .71, .83, 1);
insert into range (a_low, a_high, b_low, b_high, calc) 
values (.4, .5, .71, .83, 2);
insert into range (a_low, a_high, b_low, b_high, calc) 
values (.2, .3, .58, .77, 3);


select * from tgt;

update tgt
set calc =
(select calc 
from range 
where tgt.a 
between range.a_low and range.a_high 
and tgt.b between range.b_low and range.b_high 
/* limit is to avoid 
if error if multiple results 
- picks only one
*/
limit 1)
where 
/* and exists avoids it if there are no results */
exists
(select calc 
from range 
where tgt.a 
between range.a_low and range.a_high 
and tgt.b between range.b_low and range.b_high)

;

select * from tgt;

in postgresql this is the result:

0.32; 0.72; 1

0.41; 0.8; 2

0.28; 0.64; 3

0.31; 0.8; 1

JL Peyret
  • 10,917
  • 2
  • 54
  • 73